Data-driven stories to monitor the Government's strategy in response to the pandemic
Given the pandemic and the lack of a culture of transparency in Argentina, we set out to monitor the health strategy and evaluate the quality of public data. This was a hard task because there were many government sources whose data did not agree with one another. For several months, there was no open-format file or real-time information about the pandemic.
Therefore, we filed more than ten requests for access to public information (each with a minimum delay of twenty days) to obtain these data, which were vital for tracking the pandemic, but the Government did not give us that access. This situation gave rise to more than twenty articles that exposed the lack of tests, the precarious epidemiological surveillance system, and several problems in providing real-time answers.
The most outstanding pieces
More than 5,000 pandemic cases disappear from official records
The official database of Covid-19 cases registered in Argentina was questioned by different sectors for multiple reasons: delays of more than a month in reporting infected or deceased persons and problems in uploading data, among others. So much so that Our World in Data, the prestigious metrics site based at Oxford University, removed Argentina from its testing tracker.
Two of the most relevant anomalies we detected in this investigation were records that disappeared and cases that changed to an inconsistent status, i.e., confirmed cases that became "discarded" or "suspected". These inconsistencies, which so far have had no official explanation, make it difficult to analyze the course of the pandemic and, at the end of each day, affect the cumulative total of infected people. These are nothing more and nothing less than the figures with which Argentines and the world can follow the course of the contagion.
A total of 254,454,234 records, collected from all the files published by the Ministry of Health between June 23 and November 28, were loaded into a single BigQuery database for further analysis. The analysis relied on queries using window functions, which gave us access to the records before and after the current one.
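The comparison of each record with its previous version can be sketched in a few lines. The example below is a minimal illustration in Python rather than the BigQuery SQL actually used; it mimics what a LAG-style window function does by pairing each daily file with the one before it. The case IDs, dates, and status labels are hypothetical and do not reflect the Ministry's real schema.

```python
# Hypothetical daily snapshots: each maps a case ID to its classification.
# IDs, dates and labels are illustrative, not the Ministry's schema.
snapshots = [
    ("2020-06-23", {"A1": "confirmed", "A2": "suspected", "A3": "confirmed"}),
    ("2020-06-24", {"A1": "confirmed", "A2": "confirmed", "A3": "discarded"}),
    ("2020-06-25", {"A1": "confirmed", "A2": "confirmed"}),
]


def find_anomalies(snapshots):
    """Compare each snapshot with the previous one, as LAG() would in SQL."""
    anomalies = []
    ordered = sorted(snapshots, key=lambda s: s[0])
    # Pair every snapshot with its immediate predecessor.
    for (_, prev), (day, curr) in zip(ordered, ordered[1:]):
        for case_id, prev_status in prev.items():
            if case_id not in curr:
                # The record existed yesterday and is gone today.
                anomalies.append((day, case_id, "disappeared"))
            elif prev_status == "confirmed" and curr[case_id] != "confirmed":
                # A confirmed case moved back to another status.
                anomalies.append((day, case_id, f"confirmed -> {curr[case_id]}"))
    return anomalies


for anomaly in find_anomalies(snapshots):
    print(anomaly)
```

In this toy data, case A3 first changes from confirmed to discarded and then vanishes from the next file, which are the two kinds of anomaly described above.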
Click here to read the full article.
In June, only half of the laboratories reported by the Government processed samples
While the Government reported that more than 300 public and private laboratories in the country were analyzing patient samples to confirm or rule out a Covid-19 diagnosis, official records showed that the actual number of facilities reporting that they processed these results was much lower: only 177. This gap between official data and announcements came to light at a time when sample processing was delayed by up to 20 days.
This investigation was based on a request for access to public information filed with the national Ministry of Health, asking for the number of laboratories that reported processing samples each day. Once we obtained this information, we cleaned the dataset received and then analyzed the different announcements made by the Government. Only once did the Government's figure fall below the actual number of laboratories diagnosing Covid-19; at all other times, the Government reported a number significantly higher than the official records showed.
Click here to read the full article.
Coronavirus in Argentina: almost half of the tests were done in the last 20 days in the Greater Buenos Aires area, and they admitted they are at the limit
While the adequacy of the number of tests was being questioned and there was no coordinated official data, LA NACION processed the dataset of Covid-19 cases and analyzed in detail the tests in the urban centers most affected by the virus at that time: the Province of Buenos Aires and the City of Buenos Aires, two adjacent jurisdictions governed, one by the ruling party and the other by the opposition. The article shows that the average number of daily Covid-19 tests doubled in the Greater Buenos Aires area and that half of all diagnoses reported since last March were carried out in the 20 days prior to publication.
The public database of Covid-19 cases was used for this article. The analysis was performed in Excel by grouping records by the patients' locality of residence and estimating the number of daily tests per locality.
Click here to read the full article.
In proportion, the City of Buenos Aires tests more than twice as much as the Greater Buenos Aires area
This LA NACION article shows the difference in testing capacity between two of the most important jurisdictions in the country, which are governed by opposing political parties. While by mid-May an average of 192 tests per 100,000 inhabitants had been performed in the Province of Buenos Aires, the figure was 479 PCR tests per 100,000 inhabitants in the City of Buenos Aires (governed by the national opposition party).
The public database of Covid-19 cases was used for this article. The analysis was carried out in Excel by grouping records by the patients' locality of residence and estimating the number of daily tests per locality and their positivity. Combined with the population data published by INDEC, this yielded the rate of diagnoses per 100,000 inhabitants.
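As a worked illustration of the rate calculation, the Python sketch below normalizes a test count by population and computes positivity. The test counts and population passed to the functions are hypothetical; only the two published rates, 192 and 479 per 100,000 inhabitants, come from the article.

```python
def rate_per_100k(tests, population):
    """Tests per 100,000 inhabitants."""
    return tests * 100_000 / population


def positivity(positives, tests):
    """Share of tests that came back positive."""
    return positives / tests


# Hypothetical jurisdiction: 9,580 tests among 2,000,000 inhabitants.
print(rate_per_100k(9_580, 2_000_000))

# Hypothetical positivity: 25 positive results out of 100 tests.
print(positivity(25, 100))

# Ratio between the two published rates (City vs. Province of Buenos Aires).
print(479 / 192)
```

The ratio between the two published rates is roughly 2.5, which is consistent with the "more than twice" in the headline.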
Click here to read the full article.
Coronavirus: what are the most common symptoms among infected Argentinians?
Based on two requests for access to public information filed with the Ministry of Health, LA NACION obtained the database of symptoms, diagnosis, type of infection and use of intensive care units for the first 40,647 people, Argentinians and residents of 17 other countries, who were considered suspected Covid-19 cases between January 31 and April 21.
Through an analysis of the data provided, the article revealed the most frequent symptoms among the patients who had the disease.
The analysis of these records forced the National Government to change the "suspected case" definition and add other symptoms that emerged from the analysis presented by LA NACION, since four of the eight most frequent symptoms were not included in the Ministry of Health's definition.
The data for this article were obtained through two requests for access to public information, since they have not been published as open data to this day. When we received them, they were not standardized: they contained duplicated and unstructured records and symptoms, among other problems. For this reason, we had to clean them up using OpenRefine together with manual work in Excel.
Once this process was completed, the data were arranged in an Excel matrix to determine how frequently each symptom occurred in the public records and to contrast those symptoms with the ones the Government required for a patient to be tested.
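The frequency matrix was built in Excel; as a rough equivalent, the Python sketch below counts how often each symptom appears and lists the most frequent ones that are absent from a case definition. The patient records and the definition shown are hypothetical placeholders, not the actual data or the Ministry's criteria.

```python
from collections import Counter

# Hypothetical cleaned records: one set of symptoms per patient.
patients = [
    {"fever", "cough"},
    {"fever", "headache"},
    {"cough", "headache", "fever"},
    {"loss of smell"},
]

# Hypothetical "suspected case" definition.
case_definition = {"fever", "cough"}


def symptom_frequencies(patients):
    """Return (symptom, count) pairs, most frequent first; ties sorted by name."""
    counts = Counter(symptom for p in patients for symptom in p)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def missing_from_definition(patients, definition, top_n):
    """Top-N most frequent symptoms that the definition does not include."""
    top = [symptom for symptom, _ in symptom_frequencies(patients)[:top_n]]
    return [symptom for symptom in top if symptom not in definition]


print(symptom_frequencies(patients))
print(missing_from_definition(patients, case_definition, top_n=3))
```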
Click here to read the full article.
Coronavirus in Argentina. Uncertain data once again: more than 108,000 suspected cases without definition
LA NACION found that more than 108,000 cases reported between February 28 and mid-October (a month before the publication of the article) remained without a diagnosis (instead of being confirmed or discarded), although there had been enough time to resolve them.
The analysis was cut off one month before publication, since the data might not yet be up to date given how recently the deaths had occurred. It was concluded that 8% of the more than 3.1 million consultations attended at health centers in the country for suspected Covid-19 to date were still pending definition. Of these suspected cases pending definition, 108,143 (47%) corresponded to the period from February 28 to October 10, the last day of the interval considered in this analysis.
Once again, this article showed that the official database did not have quality information to analyze the progress of the pandemic in Argentina.
Click here to read the full article.
Coronavirus in Argentina. Another difference in official data: how many patients died without hospitalization?
From a complex analysis of the official database, LA NACION found that more than a third of the 20,795 deaths known so far had occurred outside the hospital and that, in six provinces, the proportion was even higher: more than half of the deaths reported had no record of hospitalization or critical care in Jujuy, Santiago del Estero, La Rioja, Tucumán, Córdoba and Río Negro. However, this alarming figure did not reflect what was really happening in the country and led us to find a problem in data uploading that the Government had not acknowledged. In fact, it was not that people were dying outside the hospital without the corresponding medical care, but that there was a delay in entering hospitalization data for those who had already died.
In a context where it was essential to know what Argentinians were dying of and where, this gap only made it clear once again that the information we had lacked quality.
The file containing Covid-19 cases was analyzed in SQL for this piece, grouping records by the patients' place of residence. Only deceased persons and their hospitalization dates were taken into account.
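The original analysis was run in SQL; the Python sketch below only illustrates the underlying logic, which is to keep deceased patients and compute, per province, the share with no hospitalization date on record. The rows and field layout are hypothetical and do not reproduce the official file.

```python
from collections import defaultdict

# Hypothetical rows: (province of residence, deceased?, hospitalization date or None).
rows = [
    ("Jujuy", True, None),
    ("Jujuy", True, "2020-08-01"),
    ("Jujuy", True, None),
    ("Chaco", True, "2020-07-15"),
    ("Chaco", False, None),
]


def share_without_hospitalization(rows):
    """Per province, fraction of deceased patients lacking a hospitalization date."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for province, deceased, hosp_date in rows:
        if not deceased:
            continue  # only deceased persons are taken into account
        totals[province] += 1
        if hosp_date is None:
            missing[province] += 1
    return {province: missing[province] / totals[province] for province in totals}


print(share_without_hospitalization(rows))
```

A high share in this calculation can mean either deaths outside the hospital or, as the article found, hospitalization data that had not yet been entered.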
Click here to read the full article.