QA in Everyday Data
08/22/2020 – Update: The data problem is still occurring. Next, I will run a simple SQL statement to fix the corrupt data that NASDAQ made available.
07/28/2020 – Update: This problem is still occurring.
07/10/2020 – Update: This problem is still occurring.
03/28/2020 – This week, I discovered a data issue in the Short Trading Halts files provided by NASDAQ. Most companies don’t make it easy for members of the public or non-customers to report problems or open tickets, but Twitter can sometimes be a good last resort.
The format of file shorthaltsyyyymmdd.txt consists of 4 fields/columns:
Examining the row for stock symbol PLAY below, we can see that the closing quote for Dave & Buster’s Entertainment was written after the comma, which creates a fifth data column/field.
This causes data load problems for any vendor processing this file.
ANDE,"The Andersons, Inc.",Q,3/25/2020 3:50:00 PM
PLAY,"Dave & Buster's Entertainment
PHAS,PhaseBio Pharma Common Stk,Q,3/25/2020 3:50:29 PM
ANIP,"ANI Pharmaceuticals, Inc.",Q,3/25/2020 3:50:33 PM
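A check like the following could catch this kind of row before it reaches the database. This is only a minimal sketch: it assumes the file is ordinary comma-separated text with optional double quotes, the function name find_malformed_rows is made up for illustration, and the XMPL row is a hypothetical stand-in for a misplaced closing quote, not the actual line NASDAQ published.

```python
import csv
import io

EXPECTED_FIELDS = 4  # the post states that the file has four fields/columns


def find_malformed_rows(text, expected=EXPECTED_FIELDS):
    """Return (line_number, fields) for every non-empty row whose field count differs."""
    bad = []
    reader = csv.reader(io.StringIO(text, newline=""))
    for row in reader:
        if row and len(row) != expected:
            bad.append((reader.line_num, row))
    return bad


sample = (
    'ANDE,"The Andersons, Inc.",Q,3/25/2020 3:50:00 PM\n'
    # Hypothetical malformed row (assumption): the closing quote sits after
    # the comma inside the company name, so the name splits into two fields.
    'XMPL,"Example Holdings,", Inc.,Q,3/25/2020 3:50:10 PM\n'
    'PHAS,PhaseBio Pharma Common Stk,Q,3/25/2020 3:50:29 PM\n'
)

for line_no, fields in find_malformed_rows(sample):
    print(f"line {line_no}: expected {EXPECTED_FIELDS} fields, got {len(fields)}: {fields}")
```

Running a count check like this before the load means a bad row can be set aside or reported, instead of silently shifting values into the wrong columns.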
When the file is loaded into a MySQL table based on the column assumptions provided by NASDAQ, this is the problem that occurs:
By contrast, these are examples of the good entries that NASDAQ provides 99% of the time: