Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade Lucene.Net.Analysis.OpenNLP to target .net core 6, in addition to .net framework #852

Closed
wants to merge 11 commits into from

Conversation

laimis
Copy link
Contributor

@laimis laimis commented May 11, 2023

Closes #851

@laimis laimis requested review from rclabo and removed request for rclabo May 11, 2023 21:08
…61 is selected and net7.0 when net5.0 is selected. In CI, we set IsTestProject=false and IsPublishable=false to skip these tests.
…and net6.0 for Lucene.Net.Tests.Analysis.OpenNLP tests.
…Services.Unsafe for netstandard2.0 and net462 to ensure the version will work with any combination of Lucene.Net components. This is a transitive dependency in a few 3rd party DLLs, but there may be version conflicts if this isn't done on .NET Framework.
…or netstandard2.0 and net462, since it is being used in Lucene.Net.Facet.
@NightOwl888
Copy link
Contributor

@laimis - I have fixed the dependency conflict. But this still doesn't include the changes for <MavenReference>. Since that basically builds on this, I will create a branch of this branch to add those changes. If it doesn't work, we can merge this branch instead.

@NightOwl888
Copy link
Contributor

Okay, looks like MavenReference is a bust for now: ikvmnet/ikvm-maven#10 (comment).

The fix for that is similar to how we do the file locking in Lucene.NET on macOS in the NativeFSLockFactory. Unfortunately, IKVM requires nearly 100 GB to build, and I don't have that kind of disk space to spare. But if you want to have a go at it, you are welcome.

We have yet another issue with this addition - referencing IKVM adds about 4 minutes of build time to Azure DevOps and will almost certainly slow down the GitHub Actions builds as well. Unfortunately, the size of the IKVM NuGet package and dependencies has exploded. In both Azure DevOps and GitHub Actions, there are options for caching NuGet packages so they don't have to be downloaded per build. We used to have it set up on Azure DevOps, but it wasn't set up to invalidate the cache when .build/dependencies.props changes and I couldn't figure out how to get it cleared after versions of dependencies were updated. So, we will need to investigate this, also. The cache key we need to use should include all *.props, *.targets and *.csproj files.

@laimis
Copy link
Contributor Author

laimis commented May 19, 2023

Nice, thanks for the changes there, it looks like it successfully completed on azure:

https://dev.azure.com/lucene-net-temp4/Main/_build?definitionId=1

Could we use the changes made in this branch as is to release Analysis.OpenNLP? I am not following why is IKVM needed. I definitely have the space/resources to build it, just not sure what is needed.

@laimis laimis marked this pull request as ready for review June 1, 2023 16:04
@GAInTheHouse
Copy link

@laimis @NightOwl888 would y'all be able to help me understand what's the road map for OpenNLP to target .Net 6?

@NightOwl888
Copy link
Contributor

@laimis @NightOwl888 would y'all be able to help me understand what's the road map for OpenNLP to target .Net 6?

Unfortunately, this is blocked by the fact that we don't really want to use OpenNLP.NET for this, but instead use <MavenReference>, which will transitively be added to everyone who depends on Lucene.Net.Analysis.OpenNLP. Basically, instead of including a compiled DLL in the package for opennlp-tools, it will compile it on your machine when you depend on Lucene.Net.Analysis.OpenNLP and it will resolve any other Maven dependencies that you pull into your project to fix versioning conflicts between common dependencies.

However, I have updated this branch (rebased against the master branch) and have built the packages here. Download the nuget artifact and unzip it, and you can then put the .nupkg files into a local directory and add it as a package source. That should at least get you up and running for now.

The tests we have for the OpenNLP support work, but I am not sure the entire opennlp-tools package works on macOS or iOS. It won't if there is any file locking in the application, but I haven't checked the opennlp source to determine if they are doing so. It should work on Windows and Linux, though.

As for an official release, we will do so as soon as there is a version of IKVM that supports both macOS and .NET Framework, but so far there isn't one and the latest version has some annoying initialization exceptions if you don't explicitly call new object() before you call into an IKVM-converted library. IKVM has come a long way in the past 18 months, but it just isn't there yet.

@NightOwl888
Copy link
Contributor

@GAInTheHouse

It turns out it was possible to overcome the issues with MavenReference with some due diligence. I have submitted #892, which should resolve this (even on macOS). You can download the latest update with these changes here:

https://dev.azure.com/shad0962/Experiments/_build/results?buildId=2242&view=artifacts&pathAsName=false&type=publishedArtifacts

A few things to note:

  • Sometimes it may be necessary to physically delete your bin and obj folders in order to fix IKVM build issues, and these build issues (such as type initialization errors) often look like runtime issues.
  • IKVM will download several GB of dependencies onto your build machine. So, it is important to ensure there is enough space.
  • Nearly all issues you are likely to encounter will either be with IKVM or OpenNLP (the Java project), so please report them to the appropriate party. Of course, there could still be a problem with the Lucene.Net.Analysis.OpenNLP project, and we would like to hear feedback if you do find a problem with it.
  • Since there will likely be a learning curve to get into the weeds with OpenNLP, it would be appreciated if you contribute what you learn as documentation back to this project so it can help others get up to speed more easily (as you pointed out in Scarce Documentation for OpenNLP Integration #890, there isn't much to go on). Documentation for OpenNLP can be provided via Markdown in the following file: https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Analysis.OpenNLP/overview.md. Additional documentation can be provided on class and members via inline XML doc comments. Documents can also be added as Markdown files and then added to the /websites/apidocs folder. There is some information about how to build the docs here: https://lucenenet.apache.org/contributing/documentation.html.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add .NET Core support for Lucene.Net.Analysis.OpenNLP
3 participants