add DATA_VERSION 2 to froggen.settings#7
tcbrouwer wants to merge 1 commit into LanguageMachines:master
Well, although this trick works, it would be much better to regenerate the data files for use with the newest Mbt (the main improvement being that it handles Unicode issues better). @tcbrouwer For the time being, your hack is sufficient. A regenerated Mbt file will most likely give only slightly different results (for example in the Confidence values).
@tcbrouwer Thanks for the pull request. I didn't realize that this was now broken out of the box; we definitely need to release a fixed frogdata then. Your fix looks like a quick and acceptable one, unless we indeed want to regenerate it properly as @kosloot suggests. Done! The best source, however, is https://github.com/INL/nederlab-linguistic-enrichment/tree/master/resources (it's a private repo at INT, but you probably still have access there from back then). This was all done in the scope of the Nederlab project in 2018.
That's this one: frog-bab-cgn. I think the source material is non-free, unfortunately, hence the private repo.
The 'dum' data is now updated.
This is now released in frogdata v0.21, but nld-vnn remains to be done, so the current state is not ideal.
I tried to run frog-dum and frog-nld-vnn from docker.
docker run proycon/frog -c /usr/share/frog/dum/frog.cfg
This results in
For now I work around this by extending the Docker image and copying in custom settings files with the DATA_VERSION line prepended.
I do not know, however, whether the files actually are DATA_VERSION 2. If they are, this PR seems to fix the problem. Otherwise I need another solution.
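For anyone wanting to reproduce the workaround before copying files into a derived image, the prepend step could look roughly like the sketch below. It is only an illustration: the helper name `prepend_data_version` is made up, and the line text `DATA_VERSION 2` is taken from the PR title rather than from the Mbt documentation, so treat the exact syntax as an assumption.

```python
from pathlib import Path

# Assumption: the line text comes from the PR title; the exact syntax
# expected by Mbt is not confirmed in this thread.
DATA_VERSION_LINE = "DATA_VERSION 2"


def prepend_data_version(settings_path, line=DATA_VERSION_LINE):
    """Prepend `line` to a settings file unless it already has a DATA_VERSION line.

    Returns True if the file was modified, False if it was left untouched.
    (Hypothetical helper, not part of Frog or Mbt.)
    """
    path = Path(settings_path)
    text = path.read_text(encoding="utf-8")
    # Skip files that already declare a data version, so the step is repeatable.
    if any(existing.startswith("DATA_VERSION") for existing in text.splitlines()):
        return False
    path.write_text(line + "\n" + text, encoding="utf-8")
    return True
```

Running this on a local copy of each settings file and then copying the results into the extended image mirrors the manual fix described above. It does not answer the open question of whether the data really is version 2.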