Nicolas Thouvenin authored on 1 Sep 2023

nlp-tools

Web service for querying the nlptools toolbox.

nlptools is a library of NLP tools built on top of spaCy (https://spacy.io/).

Available NLP engines:

Engine                                        French  English  name
Stemming                                      X       X        stemmer
Part-of-speech tagging                        X       X        POStagger
Controlled-term recognition                           X        termMatcher
Noun-phrase chunking                                  X        NPchunker
Noun-phrase chunking from a dependency parse          X        NPchunkerDP


Two output types (output) are available for each engine.
The result is returned either:

  • reinserted into the original text (doc)
  • as a JSON structure (json)

Querying the web service

URL syntax:

https://nlp-tools-2.services.inist.fr/v1/{langue}/{engine}/analyze?output={val}

{langue} = language to analyze             [en, fr]
{engine} = name of the processing pipeline to apply:
English:            [stemmer, postagger, npchunker, npchunkerdp, termmatcher]
French:             [stemmer, postagger]

  • Parameters:
    {val} = result format, passed as the output parameter   [doc, json]
                     doc = the result is reinserted into the document
                     json = the analysis result in JSON format
  • Return codes:
    200 if OK
    404 if the service could not be reached
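
The URL rules above can be sketched as a small Python helper. This is only an illustration: the function name `build_analyze_url` and the constants are hypothetical and not part of nlp-tools; the base URL, languages, engines and output values are the ones listed above.

```python
from urllib.parse import urlencode

# Base URL and allowed values copied from the URL syntax above.
BASE_URL = "https://nlp-tools-2.services.inist.fr/v1"
ENGINES = {
    "en": {"stemmer", "postagger", "npchunker", "npchunkerdp", "termmatcher"},
    "fr": {"stemmer", "postagger"},
}
OUTPUTS = {"doc", "json"}


def build_analyze_url(langue: str, engine: str, output: str = "json") -> str:
    """Build an analyze URL, rejecting combinations that are not listed above.

    Hypothetical helper, not part of the nlp-tools code base.
    """
    if langue not in ENGINES:
        raise ValueError(f"unsupported language: {langue!r}")
    if engine not in ENGINES[langue]:
        raise ValueError(f"engine {engine!r} is not available for {langue!r}")
    if output not in OUTPUTS:
        raise ValueError(f"unsupported output: {output!r}")
    query = urlencode({"output": output})
    return f"{BASE_URL}/{langue}/{engine}/analyze?{query}"


print(build_analyze_url("en", "postagger", "doc"))
# prints https://nlp-tools-2.services.inist.fr/v1/en/postagger/analyze?output=doc
```

Validating on the client side gives a clear error for a combination such as fr/npchunker before any request is sent.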

Example call to the POStagger, doc output:

cat data/data_en.json | curl --proxy "" -X POST --data-binary @- "https://nlp-tools-2.services.inist.fr/v1/en/postagger/analyze?indent=true&output=doc"

Example call to the termmatcher, json output:

cat <<'EOF' | curl --proxy "" -X POST --data-binary @- "https://nlp-tools-2.services.inist.fr/v1/en/termmatcher/analyze?indent=true&output=json"
  [{
 "idt":"08-0245642","value":"Random walk of passive tracers among randomly moving obstacles. Background: This study is mainly motivated by the need of understanding how the diffusion behaviour of a biomolecule (or even of a larger object) is affected by other moving macromolecules, organelles, and so on, inside a living cell, whence the possibility of understanding whether or not a randomly walking biomolecule is also subject to a long-range force field driving it to its target. Method: By means of the Continuous Time Random Walk (CTRW) technique the topic of random walk in random environment is here considered in the case of a passively diffusing particle in a crowded environment made of randomly moving and interacting obstacles. Results: The relevant physical quantity which is worked out is the diffusion cofficient of the passive tracer which is computed as a function of the average inter-obstacles distance. Coclusions: The results reported here suggest that if a biomolecule, let us call it a test molecule, moves towards its target in the presence of other independently interacting molecules, its motion can be considerably slowed down. Hence, if such a slowing down could compromise the efficiency of the task to be performed by the test molecule, some accelerating factor would be required. Intermolecular electrodynamic forces are good candidates as accelerating factors because they can act at a long distance in a medium like the cytosol despite its ionic strength."
 },{
 "idt":"08-040289","value":"Planck 2015 results. XIII. Cosmological parameters.We present results based on full-mission Planck observations of temperature and polarization anisotropies of the CMB. These data are consistent with the six-parameter inflationary LCDM cosmology. From the Planck temperature and lensing data, for this cosmology we find a Hubble constant, H0= (67.8 +/- 0.9) km/s/Mpc, a matter density parameter Omega_m = 0.308 +/- 0.012 and a scalar spectral index with n_s = 0.968 +/- 0.006. (We quote 68% errors on measured parameters and 95% limits on other parameters.) Combined with Planck temperature and lensing data, Planck LFI polarization measurements lead to a reionization optical depth of tau = 0.066 +/- 0.016. Combining Planck with other astrophysical data we find N_ eff = 3.15 +/- 0.23 for the effective number of relativistic degrees of freedom and the sum of neutrino masses is constrained to < 0.23 eV. Spatial curvature is found to be |Omega_K| < 0.005. For LCDM we find a limit on the tensor-to-scalar ratio of r <0.11 consistent with the B-mode constraints from an analysis of BICEP2, Keck Array, and Planck (BKP) data. Adding the BKP data leads to a tighter constraint of r < 0.09. We find no evidence for isocurvature perturbations or cosmic defects. The equation of state of dark energy is constrained to w = -1.006 +/- 0.045. Standard big bang nucleosynthesis predictions for the Planck LCDM cosmology are in excellent agreement with observations. We investigate annihilating dark matter and deviations from standard recombination, finding no evidence for new physics. The Planck results for base LCDM are in agreement with BAO data and with the JLA SNe sample. However the amplitude of the fluctuations is found to be higher than inferred from rich cluster counts and weak gravitational lensing. Apart from these tensions, the base LCDM cosmology provides an excellent description of the Planck CMB observations and many other astrophysical data sets."
},{
 "idt":"06-0488289","value":"Weyl gravity and Cartan geometry. We point out that the Cartan geometry known as the second-order conformalstructure provides a natural differential geometric framework underlying gaugetheories of conformal gravity. We are concerned by two theories: the first onewill be the associated Yang-Mills-like Lagrangian, while the second, inspiredby J.T. Wheeler in Phys. Rev. D90 (2014), will be a slightly more general one which will relax theconformal Cartan geometry. The corresponding gauge symmetry is treated withinthe BRST language. We show that the Weyl gauge potential is a spurious degreeof freedom, analogous to a Stueckelberg field, that can be eliminated throughthe dressing field method. We derive sets of field equations for both thestudied Lagrangians. For the second one, they constrain the gauge field to bethe normal conformal Cartan connection. Finally, we provide in a Lagrangianframework a justification of the identification, in dimension $4$, of the Bachtensor with the Yang-Mills current of the normal conformal Cartan connection,as proved in Class"
}]
EOF
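
The same kind of call can be made from Python with the standard library only. This is a minimal sketch under assumptions: `make_request` and `analyze` are hypothetical names, the request body is a JSON array of objects with `idt` and `value` keys as in the example above, and the response is assumed to be UTF-8 text. No Content-Type header is set, so urllib falls back to the same default as the curl commands above.

```python
import json
import urllib.request


def make_request(url: str, records: list) -> urllib.request.Request:
    """Prepare a POST request whose body is the JSON-encoded list of records."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def analyze(url: str, records: list, timeout: float = 30.0) -> str:
    """Send the records to the service and return the raw response text.

    An empty ProxyHandler bypasses any configured proxy, like `curl --proxy ""`.
    Decoding the response as UTF-8 is an assumption.
    """
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(make_request(url, records), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For instance, `analyze("https://nlp-tools-2.services.inist.fr/v1/en/termmatcher/analyze?output=json", [{"idt": "1", "value": "Random walk of passive tracers."}])` would submit one record. Errors such as the 404 return code surface as `urllib.error.HTTPError`.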

EZmaster-like integration test

cd public
sed -e '1d; $d' ../data/data_en.json | sed 's/\,$/ /g' | python3 analyze.py stemmer -o doc -lang en
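
The two sed commands drop the first and last lines of the file and strip trailing commas, which appears to turn the JSON array into one JSON object per line before it is piped to analyze.py. A rough Python equivalent, assuming that one-object-per-line input is what analyze.py reads on standard input (the function name `array_to_json_lines` is hypothetical):

```python
import json


def array_to_json_lines(text: str) -> str:
    """Convert a JSON array of records into one JSON object per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    # ensure_ascii=False keeps accented characters readable in the output.
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


sample = '[\n{"idt": "1", "value": "First text."},\n{"idt": "2", "value": "Second text."}\n]'
print(array_to_json_lines(sample), end="")
# prints:
# {"idt": "1", "value": "First text."}
# {"idt": "2", "value": "Second text."}
```

Unlike the sed version, this does not depend on the array being laid out with one record per line in the source file.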

Usage:

analyze.py [-h] [-ini INIT_FILE] [-log LOG] [-lang {fr,en}] [-param PARAM] [-o doc]
                  {stemmer,termMatcher,NPchunker,POStagger,gazetteer,NPchunkerDP,lefff_tagger}

positional arguments:
  {stemmer,termMatcher,NPchunker,POStagger,gazetteer,NPchunkerDP,lefff_tagger}
                        Name oh the NLPpipe

optional arguments:
  -h, --help            show this help message and exit
  -ini INIT_FILE, --init-file INIT_FILE
                        initialisation file [default config.ini]
  -log LOG, --log LOG   log file
  -lang {fr,en}, --language {fr,en}
                        language
  -param PARAM, --param PARAM
                        initialisation param in json
  -o doc, --output doc  Format result
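
When analyze.py is driven from another Python program, the command line can be assembled from the options shown in the usage text. A minimal sketch: `build_analyze_command` is a hypothetical helper, and the accepted engine and language values are copied from the usage message above.

```python
# Choices copied from the usage message above.
ENGINE_CHOICES = (
    "stemmer", "termMatcher", "NPchunker", "POStagger",
    "gazetteer", "NPchunkerDP", "lefff_tagger",
)
LANG_CHOICES = ("fr", "en")


def build_analyze_command(engine: str, lang: str, output: str = "doc",
                          script: str = "analyze.py") -> list:
    """Return the argument list for running analyze.py (hypothetical helper)."""
    if engine not in ENGINE_CHOICES:
        raise ValueError(f"unknown engine: {engine!r}")
    if lang not in LANG_CHOICES:
        raise ValueError(f"unknown language: {lang!r}")
    return ["python3", script, engine, "-o", output, "-lang", lang]


print(build_analyze_command("stemmer", "en"))
# prints ['python3', 'analyze.py', 'stemmer', '-o', 'doc', '-lang', 'en']
```

The returned list can be passed to `subprocess.run(cmd, input=..., text=True, capture_output=True)` to feed the input records on standard input, as the integration test above does with a shell pipe.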