Sisyphe-go is a Go command-line application for the recursive analysis of directories and files in scientific publishing corpora.
Sisyphe-go is a generic terminal application, written in Go, that recursively analyses folders.
Tested with Golang 1.18
Works on Linux/OSX/Windows
Create and fill in the following environment variables on the host machine
Generic analysis
docker-compose up -d
docker exec -t sisyphe-go_go_1 go run . -n corpusName -p corpusPath -o outputPath
Detailed analysis
docker exec -t sisyphe-go_go_1 go run . -n corpusName -c corpusResourcesPath -p corpusPath -o outputPath
Example:
docker exec -t sisyphe-go_go_1 go run . -n karger-ebooks-2022-08-08-detaillee -c /corpus-resources -p /work/sample/karger_2020_11_06
By default, the program writes its results to SISYPHE_OUT.
go build .
go run . --help
This prints the help:
--help Output usage
-c Configuration folder path
-n Corpus name (default "test")
-o Output directory where results are written
-p Corpus path
-w Counting word on pdf
-noanalyze Disable analysis
-noindex Disable indexation
-noxpath Disable xpaths.csv file generation
-noattval xpaths.csv without attribute value
Just run Sisyphe-go on any folder that contains files.
go run . -p ~/Documents/customfolder/corpus -n corpusname -o outputpath
go run . -p ~/Documents/customfolder/corpus -n corpusname -c ~/Documents/customfolder/corpusResources -o outputpath
Sisyphe-go now works in the background, using all of your computer's threads. Grab a coffee and wait; it will notify you when it is done :)
Sisyphe-go writes its results to outputpath/{timestamp}-corpusName/
(errors, info, duration..)
Run the tests with go test
For coverage, run go test -cover
pdftotext and pdfinfo
xmlstarlet and xmllint