my department, which scans and manages documents, has finally begun to implement an OCR program after literally 7+ years of working on finding and contracting a vendor for it. the software the vendor delivered is borked in so, so many ways that even my supervisor, who has been put in charge of implementing it and training us on it, is absolutely shocked by how incompetent it is.
so, among the many quirks and glitches i have encountered, the one that probably has me the most absolutely sure that we are being scammed is how it handles DOBs for people who are 90+
- i click into DOB text field
- i click and drag on the document image to create a box around the text of the pt's DOB. the text on the page was something like 10/10/34
- the text 10/10/34 appears in the DOB text field
  - alternatively, if i just type "10/10/34" into the DOB text field, it does the same thing
- i click the button that causes the software to process the raw text string into a piece of data that represents a date
- the date it spits out is 10/10/2034
so, for context, there is never ever any reason why any date, either a DOB or a document date (every single document has one of each of these), would be in the future. why on earth would the software not be able to reference whatever today's date is according to the machine it's running on and make sure that it's not spitting out a future date?
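to show what i mean by "reference today's date", here is a rough python sketch. this is my own guess at the idea, not anything from the vendor: the function name `parse_mmddyy` is made up, and i'm assuming the input is always a valid MM/DD/YY string and that a date can never be in the future.

```python
from datetime import date
from typing import Optional


def parse_mmddyy(text: str, today: Optional[date] = None) -> date:
    """Parse MM/DD/YY, picking the latest century that is not in the future.

    Hypothetical helper for illustration only; assumes valid MM/DD/YY input.
    """
    # Use the machine's clock unless a date is passed in (handy for testing).
    today = today or date.today()
    month, day, yy = (int(part) for part in text.strip().split("/"))

    # Start of the current century, e.g. 2000 while today is in 20xx.
    century = today.year - today.year % 100

    # First try the current century; fall back 100 years if that is in the future.
    candidate = date(century + yy, month, day)
    if candidate > today:
        candidate = date(century - 100 + yy, month, day)
    return candidate


# Example with a pinned "today" so the result does not depend on when it runs.
print(parse_mmddyy("10/10/34", today=date(2025, 1, 15)))  # 1934-10-10
```

the only "dynamic" part is one call to `date.today()` and one comparison, so as far as i can tell this is not expensive at all.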
my instinct tells me that they did not implement a proper date system that can function in perpetuity, but rather assigned a finite range of dates for the software to choose from when interpreting MM/DD/YY. in other words, instead of having the software dynamically reference the computer date, it has a static set of if/then statements that it compares YY to: is YY > 35? it's 20th century, prepend "19". is YY 35 or less? it's 21st century, prepend "20".
if this is the case, it causes a slight problem right now when interpreting DOBs of people over 90, but if we keep using the software past the maximum date range for deciding "this must mean 21st c", we are going to get to 2036 and suddenly every single document date written with a two-digit year will start defaulting to the 1900s, e.g. MM/DD/36 becomes 1936.
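if my guess is right, the logic would look something like the python sketch below. to be clear, i have not seen the vendor's code: the cutoff of 35 is only my guess from the behavior i've observed, the names `PIVOT` and `expand_fixed_pivot` are made up, and i'm assuming 35 itself still goes to 20xx.

```python
# Hypothetical hard-coded cutoff; the real value (if there is one) is unknown.
PIVOT = 35


def expand_fixed_pivot(yy: int) -> int:
    """Expand a two-digit year using a fixed cutoff, ignoring today's date."""
    if yy > PIVOT:
        return 1900 + yy  # treated as 20th century
    return 2000 + yy      # treated as 21st century


print(expand_fixed_pivot(34))  # 2034: a DOB from 1934 lands in the future
print(expand_fixed_pivot(36))  # 1936: a document dated in 2036 lands a century back
```

nothing in that function ever looks at the clock, so the window never moves, which is exactly the 2036 problem i'm worried about.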
so, the question i have for ppl who know more about software than me is: how fucked are we? is dynamically referencing the computer date actually way more complex and resource intensive than i think it is? is this normal practice? as i said, this is not the only thing shady about this vendor and this software, but it is one of the most glaring to me, someone with a single semester of programming and data management under my belt + some time spent programming as a hobby during lockdown.
