![sisyphe](./docs/logo-sisyphe.jpg)

## Sisyphe-GO

Sisyphe is a generic recursive folder analyser for the terminal, written in Go.

![Sisyphe-pic](./docs/sisyphe.gif)

### Requirements

Tested with Golang 1.17

Works on Linux/OSX/Windows

Mount a corpus folder, then run:

```bash
docker-compose up -d
docker exec -it sisyphe_go_go_1 go run . -n corpusname -c corpuspath -o outputpath
```

### Install locally

1. Download the latest Sisyphe-go version
2. Run `go build .`
3. ... that's it.

### Help

`go run . --help` prints the usage help.

### Options

    --help  Output usage
    -c      Configuration folder path
    -n      Corpus name (default "test")
    -o      Output directory where results are written
    -p      Corpus path
    -w      Count words in PDF files
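
As a rough illustration of how the options above could be declared, here is a minimal sketch using Go's standard `flag` package. It is not taken from the Sisyphe source: the `options` struct, its field names and `parseOptions` are hypothetical, and treating `-w` as a boolean switch is an assumption. Note that the standard `flag` package stops parsing at the first positional argument, so this sketch expects flags to come before any positional path.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the flags listed in this README.
// The struct and its field names are hypothetical.
type options struct {
	configDir  string // -c
	corpusName string // -n
	outputDir  string // -o
	corpusPath string // -p
	countWords bool   // -w (assumed to be a boolean switch)
}

// parseOptions parses args (without the program name) into an options value.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("sisyphe", flag.ContinueOnError)
	fs.SetOutput(os.Stderr) // usage and parse errors go to stderr
	fs.StringVar(&o.configDir, "c", "", "Configuration folder path")
	fs.StringVar(&o.corpusName, "n", "test", "Corpus name")
	fs.StringVar(&o.outputDir, "o", "", "Output directory where results are written")
	fs.StringVar(&o.corpusPath, "p", "", "Corpus path")
	fs.BoolVar(&o.countWords, "w", false, "Count words in PDF files")
	if err := fs.Parse(args); err != nil {
		return o, fmt.Errorf("parse options: %w", err)
	}
	return o, nil
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

With this sketch, `-help` and `--help` are handled by the `flag` package itself, which prints the declared flags and their defaults.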

### How it works

Just start Sisyphe on a folder with any files in it.

`go run . ~/Documents/customfolder/corpus -n corpusname -o outputpath`

`go run . ~/Documents/customfolder/corpus -n corpusname -c ~/Documents/customfolder/corpusResources -o outputpath`

Sisyphe now runs in the background, using all of your computer's threads.
Grab a coffee and wait; it will notify you when it's done :)

Sisyphe writes its results to `sisyphe/out/{timestamp}-corpusname/` (errors, info, duration...)

![Sisyphe-dashboard](./docs/sisyphe-monitor.gif)

### Modules

There is a list of default modules (focused on xml & pdf).

These URLs need to be updated once the branch is merged.

-   [FILETYPE](https://github.com/istex/sisyphe/tree/master/src/worker/filetype) Detects MIME type, extension, corrupted files...
-   [PDF](https://github.com/istex/sisyphe/tree/master/src/worker/pdf) Extracts information from PDF files (version, author, metadata...)
-   [XML](https://github.com/istex/sisyphe/tree/master/src/worker/xml) Checks that files are well-formed and valid against their DTDs, and extracts elements from tags...
-   [XPATH](https://github.com/istex/sisyphe/tree/master/src/worker/xpath) Generates a complete list of XPaths from the submitted folder
-   [OUT](https://github.com/istex/sisyphe/tree/master/src/worker/out) Exports data to a JSON file and an ElasticSearch database