
## Sisyphe-GO

Sisyphe-go is a generic terminal application, written in Go, that recursively analyses the contents of a folder.


### Requirements

Tested with Go 1.18.

Works on Linux, macOS, and Windows.

Set the following environment variables on the host machine:
- WORK
- CORPUS_RESOURCES
- SISYPHE_OUT
- ELASTIC_URL
- ELASTIC_PORT
- KIBANA_PORT
- UID=$(id -u)
- GID=$(id -g)
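
A quick way to catch a forgotten variable before starting the containers is a small pre-flight check. The sketch below is not part of Sisyphe-go; `requiredEnv` and `missingEnv` are hypothetical names, and the list simply repeats the variable names given above.

```go
package main

import (
	"fmt"
	"os"
)

// requiredEnv repeats the variable names listed in the Requirements section.
var requiredEnv = []string{
	"WORK", "CORPUS_RESOURCES", "SISYPHE_OUT",
	"ELASTIC_URL", "ELASTIC_PORT", "KIBANA_PORT", "UID", "GID",
}

// missingEnv returns the names for which lookup reports no value or an
// empty one. lookup has the same shape as os.LookupEnv, so it can be
// replaced by a fake in tests.
func missingEnv(names []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range names {
		if value, ok := lookup(name); !ok || value == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	missing := missingEnv(requiredEnv, os.LookupEnv)
	if len(missing) > 0 {
		fmt.Println("missing environment variables:", missing)
		return
	}
	fmt.Println("all required environment variables are set")
}
```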


### Execution

Generic analysis
```bash
docker-compose up -d
docker exec -t sisyphe_go_go_1 go run . -n corpusName -p corpusPath -o outputPath
```

Detailed analysis
```bash
docker exec -t sisyphe_go_go_1 go run . -n corpusName -c corpusResourcesPath -p corpusPath -o outputPath
```

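Example (the container name depends on the Compose project name, so it may be `sisyphe_go_go_1` or `sisyphe-go_go_1`; check with `docker ps`):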
```bash
docker exec -t sisyphe-go_go_1 go run . -n karger-ebooks-2022-08-08-detaillee -c /corpus-resources -p /work/sample/karger_2020_11_06
```

By default, the program writes its results to `SISYPHE_OUT`.


### Local installation

1. Download the latest Sisyphe-go version.
2. Build it: `go build .`
3. That's it.

### Help

`go run . --help` prints the usage help.

### Options

    --help      Print usage
    -c          Configuration folder path
    -n          Corpus name (default "test")
    -o          Output directory where results are written
    -p          Corpus path
    -w          Count words in PDF files
    -noanalyze  Disable analysis
    -noindex    Disable indexing
    -noxpath    Disable xpaths.csv file generation
    -noattval   Generate xpaths.csv without attribute values
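
To make the flag semantics concrete, here is a minimal sketch of how such an option set could be declared with Go's standard `flag` package. It is an illustration, not Sisyphe-go's actual source: the `options` struct, its field names, and `parseOptions` are hypothetical, and treating `-w` and the `-no...` switches as booleans is an assumption drawn from their descriptions.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the documented flags; the field names are hypothetical.
type options struct {
	configPath string // -c
	corpusName string // -n
	outputPath string // -o
	corpusPath string // -p
	countWords bool   // -w (assumed to be a boolean switch)
	noAnalyze  bool   // -noanalyze
	noIndex    bool   // -noindex
	noXPath    bool   // -noxpath
	noAttVal   bool   // -noattval
}

// parseOptions parses args (without the program name) into an options value.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("sisyphe-go", flag.ContinueOnError)
	fs.StringVar(&o.configPath, "c", "", "Configuration folder path")
	fs.StringVar(&o.corpusName, "n", "test", "Corpus name")
	fs.StringVar(&o.outputPath, "o", "", "Output directory where results are written")
	fs.StringVar(&o.corpusPath, "p", "", "Corpus path")
	fs.BoolVar(&o.countWords, "w", false, "Count words in PDF files")
	fs.BoolVar(&o.noAnalyze, "noanalyze", false, "Disable analysis")
	fs.BoolVar(&o.noIndex, "noindex", false, "Disable indexing")
	fs.BoolVar(&o.noXPath, "noxpath", false, "Disable xpaths.csv file generation")
	fs.BoolVar(&o.noAttVal, "noattval", false, "Generate xpaths.csv without attribute values")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		// The flag package has already printed the error or the usage text.
		return
	}
	fmt.Printf("%+v\n", o)
}
```

With the standard `flag` package, `-noindex` and `--noindex` are equivalent, and `-help` or `--help` prints the generated usage text.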

### How does it work?

Run Sisyphe-go on any folder that contains files. The first command below runs the generic analysis; the second adds `-c` for the detailed analysis.

`go run . -p ~/Documents/customfolder/corpus -n corpusname -o outputpath`

`go run . -p ~/Documents/customfolder/corpus -n corpusname -c ~/Documents/customfolder/corpusResources -o outputpath`

Sisyphe-go now runs in the background, using all of your machine's threads.
Grab a coffee and wait; it will notify you when it's done :)

Sisyphe-go writes its results to `outputpath/{timestamp}-corpusName/` (errors, info, duration, ...).
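
The general pattern behind "walk a folder recursively and use every thread" can be sketched as a directory walk feeding a pool of goroutines. This is a minimal illustration of that pattern, not Sisyphe-go's actual implementation: `listFiles`, `analyseAll`, and the per-file "analysis" (here just the file extension) are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"runtime"
	"sync"
)

// listFiles walks root recursively and returns every non-directory path.
func listFiles(root string) ([]string, error) {
	var paths []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			paths = append(paths, path)
		}
		return nil
	})
	return paths, err
}

// analyseAll runs analyse on every path using a fixed number of worker
// goroutines and returns the result for each path.
func analyseAll(paths []string, workers int, analyse func(path string) string) map[string]string {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(map[string]string, len(paths))
	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for path := range jobs {
				r := analyse(path)
				mu.Lock()
				results[path] = r
				mu.Unlock()
			}
		}()
	}
	for _, path := range paths {
		jobs <- path
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	paths, err := listFiles(root)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	// One worker per logical CPU; the placeholder analysis is the extension.
	results := analyseAll(paths, runtime.NumCPU(), filepath.Ext)
	fmt.Printf("analysed %d files under %s\n", len(results), root)
}
```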

### Test

Run the tests:
`go test`

With coverage:
`go test -cover`

### Modules

- PDF
    Relies on the `pdftotext` and `pdfinfo` command-line tools
- XML
    Relies on the `xmlstarlet` and `xmllint` command-line tools
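
As an illustration of how a Go program can lean on these external tools, the sketch below shells out to `pdfinfo` to read a page count and to `xmllint --noout` to check that an XML file is well formed. It is an assumption about the general approach, not the modules' actual code; `parsePageCount`, `pdfPageCount`, and `isWellFormedXML` are hypothetical helpers, and both tools must be on the `PATH`.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"strings"
)

// parsePageCount extracts the number on the "Pages:" line of pdfinfo output.
func parsePageCount(output string) (int, bool) {
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "Pages:") {
			value := strings.TrimSpace(strings.TrimPrefix(line, "Pages:"))
			n, err := strconv.Atoi(value)
			if err != nil {
				return 0, false
			}
			return n, true
		}
	}
	return 0, false
}

// pdfPageCount runs pdfinfo on path and returns the reported page count.
func pdfPageCount(path string) (int, error) {
	out, err := exec.Command("pdfinfo", path).Output()
	if err != nil {
		return 0, fmt.Errorf("pdfinfo %s: %w", path, err)
	}
	n, ok := parsePageCount(string(out))
	if !ok {
		return 0, fmt.Errorf("no page count in pdfinfo output for %s", path)
	}
	return n, nil
}

// isWellFormedXML reports whether xmllint parses path without error.
func isWellFormedXML(path string) bool {
	return exec.Command("xmllint", "--noout", path).Run() == nil
}

func main() {
	for _, path := range os.Args[1:] {
		switch {
		case strings.HasSuffix(strings.ToLower(path), ".pdf"):
			n, err := pdfPageCount(path)
			fmt.Println(path, "pages:", n, "error:", err)
		case strings.HasSuffix(strings.ToLower(path), ".xml"):
			fmt.Println(path, "well formed:", isWellFormedXML(path))
		}
	}
}
```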