Investigate issues for NRIS-EPD and NRIS-EMLI Importers #1261

Open
Keegnan opened this issue Jul 24, 2024 · 5 comments

@Keegnan

Keegnan commented Jul 24, 2024

Problem Description
The imports are taking up to 19 hours. They finish with a status of failed, and we are not sure whether the records are being written.

  • EPD is taking 9 hours and failing

  • EMLI is taking 3 hours and failing


Solution Needs

  • Find out why the imports are coming back with a failed status and why they are taking so long.

Timebox

  • About 2 days. If no answers by then we can reassess.

Outcome

  • Understand the cause of the failed import status and the long run times, and what steps need to be taken to resolve them
  • Understand if this is fixable by sustainment (within our domain of work)

Additional Context

@Keegnan Keegnan added the Spike label Jul 24, 2024
@Keegnan Keegnan changed the title from "Investigate issues for NRIS-EPD and NRIS-EMLI importers" to "Investigate issues for NRIS-EPD and NRIS-EMLI Importers" Jul 24, 2024
@sggerard
Contributor

I believe both EMLI and EPD are taking a long time and ending with a failed status because of problems with the NRISWS API. The API appears to become slow and start throwing 500 errors late at night and in the early morning. I was able to replicate these errors in Postman at 9pm.
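As an illustration only, here is a minimal Python sketch of how a caller could tolerate intermittent 500 responses by retrying with exponential backoff. It is not taken from the importer code: `fetch_with_retry`, its parameters, and the `fetch` callable (which stands in for a real HTTP request and returns a status code and body) are all hypothetical names.

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str, int]:
    """Call fetch() until it returns a non-5xx status or attempts run out.

    fetch is a hypothetical stand-in for the real API request; it must
    return (status_code, body). Returns (status, body, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status, body = 0, ""
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status < 500:
            # Success or a client error: retrying would not help.
            return status, body, attempt
        if attempt < max_attempts:
            # Wait 1x, 2x, 4x, ... the base delay before the next try.
            sleep(base_delay * 2 ** (attempt - 1))
    # Every attempt returned a 5xx; hand the last response to the caller.
    return status, body, max_attempts
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Retrying would only mask a slow API, so it would not replace fixing the underlying server problem.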

It likely doesn't help if the CRON jobs for our TEST and PROD environments end up overlapping.
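To check whether two scheduled runs collide, one could compare their time intervals. The sketch below is a rough illustration: `runs_overlap` is a hypothetical helper, and the actual start times of the TEST and PROD jobs are not stated in this thread, so any values passed in would have to come from the real CRON schedules.

```python
from datetime import datetime, timedelta


def runs_overlap(
    start_a: datetime,
    duration_a: timedelta,
    start_b: datetime,
    duration_b: timedelta,
) -> bool:
    """Return True if two job runs share any moment in time.

    Each run is treated as the half-open interval [start, start + duration).
    """
    end_a = start_a + duration_a
    end_b = start_b + duration_b
    # Two intervals overlap when each one starts before the other ends.
    return start_a < end_b and start_b < end_a
```

With run times as long as those reported above (for example, 9 hours for EPD), two jobs started within a few hours of each other would overlap for most of their duration.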

Additionally, the EMLI integration isn't actually processing any records at the moment because the inspection sub types have changed. This will be fixed in ticket #1238.

Image

Image

Image

@sggerard
Contributor

Both Test and Prod have shown these errors only since the COCOCOLA server was replaced with new hardware.

These errors seem to occur mainly from 8-10pm and 3-7am.
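A pattern like this could be confirmed by counting 500 responses per hour of day. The following is a small sketch under stated assumptions: log entries are assumed to be available as `(timestamp, status_code)` pairs in local time, and `ERROR_WINDOWS`, `in_reported_window`, and `count_500s_by_hour` are hypothetical names, not part of any existing tooling.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple

# Hours of day (24-hour clock, end exclusive) matching the windows
# reported above: 8-10pm and 3-7am.
ERROR_WINDOWS = [(20, 22), (3, 7)]


def in_reported_window(ts: datetime) -> bool:
    """Return True if ts falls inside one of the reported error windows."""
    return any(start <= ts.hour < end for start, end in ERROR_WINDOWS)


def count_500s_by_hour(entries: Iterable[Tuple[datetime, int]]) -> Counter:
    """Count entries with HTTP status 500, keyed by hour of day."""
    return Counter(ts.hour for ts, status in entries if status == 500)
```

If most of the counts land in hours where `in_reported_window` is true, that would support the observation; a flat distribution would suggest the timing is coincidental.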

Image

Image

@sggerard
Copy link
Contributor

I have created a service desk ticket https://apps.nrs.gov.bc.ca/int/jira/servicedesk/customer/portal/1/SD-119888

@sggerard
Contributor

sggerard commented Aug 1, 2024

Kris Clarke has been investigating the issues raised in my ticket and has doubled the JVM memory of the NRIS-WS API. I will keep an eye on the logs to see whether this improves performance.

@sggerard
Contributor

sggerard commented Aug 6, 2024

After the memory increase we are no longer seeing any 500 errors!

Image
