Hello guys,
In the last 2 weeks, I did some serious research on pizzagate. I collected the data on all missing children from the NCMEC website (missingkids.com), put it in a database and ran some statistical analysis on Virginia and other states. All the project files, including the database, can be found in my GitHub repository.
Sadly, the research files are in German, so I will give you a short overview here. If you are interested, you can find them on my homepage; it's around 30 pages with many maps.
The direct link to the file is here.
Sorry for my English. :-)
The results:
The data for the last 3 years can't be analyzed, because there are too many false positives (too many children who have yet to be found and who have no connection to pizzagate). So I used the time frame 1984-2013 (30 years).
It's important to note that most of the data is expressed as "missing children per capita", so it doesn't matter how many people live in a given US state or county.
First of all, I calculated the number of missing children from the database for those 30 years, and Virginia is only number 13 out of 50 states + DC. Maryland is number 7. I plotted the data on a US map, and Virginia and Maryland both looked suspicious, because all the other states in the eastern US were in better shape than those two. You can see the map in the file I linked above.
That's why I started looking at the changes over the time frame. I found that between 1984 and 2005 Virginia was number 37, which means it was one of the safest US states for missing children (it was very unlikely that a missing kid would not be found, in whatever condition). Virginia was very similar to all the other states in the eastern US. That turned around in 2006: since then it has been number 1 nationwide. I was shocked.
I calculated the change in "missing children per capita" for all states and then how likely a change like the one in Virginia is. The odds are 0.014%; it's around a 3.4 sigma event (for the statisticians) among all states, and it's extremely unlikely to be a coincidence.
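For readers who want to relate a sigma value to a probability, here is a minimal sketch. It assumes the state-to-state changes are roughly normally distributed and uses a one-sided tail; both are assumptions, since the overview does not say which distribution or tail was used, and the function name `upper_tail` is only illustrative.

```python
import math

def upper_tail(z):
    """One-sided probability of landing at least z standard deviations above the mean
    of a normal distribution (assumption: normality, one-sided tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (3.0, 3.4, 3.6):
    print(f"{z} sigma -> {upper_tail(z):.4%}")
```

Under these assumptions a 3.4 sigma offset corresponds to a tail of roughly 0.03%, so a quoted percentage and a quoted sigma value need not match exactly if a different tail or distribution was used.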
Further, I compared Virginia with just the East Coast states, because it's easy to see that there are east/west and some north/south differences in "missing children per capita" in all time frames. The results for the eastern US alone were about the same: 0.015% for it being just a coincidence.
The way I calculated the numbers, you can't explain them by population change, crime rate, change in crime rate, change in wealth or whatever... Further, I calculated the "overhead" for Virginia (the excess over the expected number), and it is 21 children between 2006 and 2013. They are likely to be victims of pedophiles.
Then I also compared the US counties, and Fairfax is number 1 nationwide among all counties with more than 500,000 inhabitants. Baltimore is suspicious too. It is possible that there was some shift of pedophiles from Baltimore to Fairfax County between 2006 and 2013.
There are some more calculations, but they are not as important. Many more explanations can be found in the PDF file mentioned above. It's important to realize that imported children, for example from Haiti, would not be visible in these results, since they would not be reported missing in the USA.
bye
connornm777 ago
Good stuff. I've been busy with studies lately but have wanted to do something more technical like this. I'm stealing your data and code, and I'll try to independently verify your results. I'll pm you if I get around to it soon.
Do you happen to have a write-up of how you calculated that confidence interval? Did you just calculate a mean and standard deviation from the pre-2006 data, assuming a normal distribution, and use that to claim post-2006 is a 3-4 sigma event?
Pizza_agent ago
Hey,
No, the results would be messed up if you did it that way. The calculation was more complex. There are 2 ways of doing it.
1) The goal is to get the "change" for each state between pre-2006 and post-2006 and compare the changes with each other. As I said above, I used the "number of missing children per capita"; I call it "density". That means you get the number of children from the database for the pre-2006 time range (22 years), then you get the number for post-2006. You do this for each state, of course; you can achieve it with 2 SQL queries. Now you have to calculate the "density", for which you need the number of inhabitants of each state. I imported the data from the cited source; the data was from 2015.
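The two queries could look roughly like the sketch below. The table name `missing_children` and its columns `state` and `missing_year` are hypothetical, as is the helper `counts_by_state`; the actual schema in the repository may differ, and the rows are made up just to show the shape of the result.

```python
import sqlite3

def counts_by_state(conn, first_year, last_year):
    """Return {state: number of cases} with missing_year in [first_year, last_year]."""
    rows = conn.execute(
        "SELECT state, COUNT(*) FROM missing_children "
        "WHERE missing_year BETWEEN ? AND ? GROUP BY state",
        (first_year, last_year),
    ).fetchall()
    return dict(rows)

# Tiny in-memory table with made-up rows (hypothetical schema, not the real data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE missing_children (state TEXT, missing_year INTEGER)")
conn.executemany(
    "INSERT INTO missing_children VALUES (?, ?)",
    [("AA", 1990), ("AA", 2001), ("AA", 2008),
     ("BB", 1985), ("BB", 2010), ("BB", 2013)],
)

pre_counts = counts_by_state(conn, 1984, 2005)   # query 1: pre-2006 (22 years)
post_counts = counts_by_state(conn, 2006, 2013)  # query 2: post-2006 (8 years)
print(pre_counts, post_counts)
```

Running the same parameterized query twice keeps the two time ranges strictly comparable, since only the year bounds change.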
And now you might see the first problem: the population grew between pre-2006 and post-2006, so you can't use a single value for both time frames, or the "density" would be wrong. The difference between the time ranges is about 16.8%, so you have to correct the data by this value when you calculate the density.
The second problem is that the time frames are not of equal length. The pre-2006 time frame is 2.75 times longer than the post-2006 one, so obviously its count would be higher by this factor; you have to take that into account and correct the data as well.
And there is a third problem: the post-2006 time frame contains more "yet to be found" children than the other time frame, because less time has passed. The long-term numbers for each year suggest that around 30% of the children missing in post-2006 will still be found in the coming years; I discussed this topic in my paper. That's why you multiply the post-2006 density by 0.7. You can play with this number; even 0.5 would still give a >3 sigma result in the end...
Now you have comparable densities for both time frames. You just subtract them, and then you compare the states with each other: calculate the mean, the standard deviation and Virginia's offset from the mean. By the way, a possible nationwide increase in "missing children" would not affect the calculation, since it would take place in every state.
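The steps of 1) can be sketched as follows. This is only an illustration under assumptions: the direction in which each correction is applied (shrinking the 2015 population by 16.8% for pre-2006, dividing the pre-2006 density by 2.75), the use of the sample standard deviation, and the inclusion of the target state in the mean are my guesses, and all names and input numbers are made up.

```python
from statistics import mean, stdev

POP_GROWTH = 1.168     # population about 16.8% higher in the later period
LENGTH_RATIO = 22 / 8  # 1984-2005 vs 2006-2013, i.e. 2.75
FOUND_LATER = 0.7      # about 30% of post-2006 cases expected to be found later

def density_change(pre_count, post_count, population):
    """Post-2006 minus pre-2006 density, both scaled to the 8-year window."""
    pre_population = population / POP_GROWTH           # assumed correction direction
    pre_density = pre_count / pre_population / LENGTH_RATIO
    post_density = post_count / population * FOUND_LATER
    return post_density - pre_density

def sigma_offset(changes, state):
    """Offset of one state from the mean of all states, in standard deviations."""
    values = list(changes.values())
    return (changes[state] - mean(values)) / stdev(values)

# Made-up inputs: state -> (pre_count, post_count, population in 2015)
toy = {
    "AA": (55, 20, 5_000_000),
    "BB": (80, 30, 8_000_000),
    "CC": (30, 12, 3_000_000),
    "DD": (60, 24, 6_000_000),
    "EE": (40, 45, 7_000_000),  # deliberately built to stand out
}
changes = {state: density_change(*row) for state, row in toy.items()}
print(sigma_offset(changes, "EE"))
```

With only a handful of states the offset cannot get very large, so the toy output says nothing about the real result; it only shows where each correction enters.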
2) A different way to get the results is to compare not the "density changes" but just the density of each state. Here you don't need to correct the data as described above. You just calculate the post-2006 density and compare the states; you don't need the pre-2006 data. To improve the results you should compare only the states in the eastern US and not the whole US, which is what I did. The reason is that there is a clear difference between the eastern and western US. In general, the western US has more missing children per capita; it's easy to see on a colored US map like the one in my paper. I don't know the reason; it could be some ethnic or racial differences between the states. But it's pretty clear that it's best to compare Virginia with similar states, like New Jersey etc. I compared the whole eastern US because I didn't want to be accused of "cherry picking". And if you look at the eastern US, all the states except Virginia are pretty much equal. I excluded states with fewer than 2,000,000 inhabitants, since the time frame is too short and you get too much noise. As expected, you don't get the same result here as in 1), but you still get a >3 sigma event.
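A minimal sketch of 2), again under assumptions: which states count as "east" is passed in as a flag because the exact list is not given here, the sample standard deviation is used, the target state is included in the mean, and the function name is illustrative.

```python
from statistics import mean, stdev

MIN_POPULATION = 2_000_000  # smaller states are dropped as too noisy

def post_density_offset(rows, target):
    """rows: {state: (post_count, population, is_east)}.
    Returns the target's post-2006 density offset from the eastern mean, in sigmas."""
    densities = {
        state: count / population
        for state, (count, population, is_east) in rows.items()
        if is_east and population >= MIN_POPULATION
    }
    values = list(densities.values())
    return (densities[target] - mean(values)) / stdev(values)
```

No growth, length or "found later" correction appears here, because only one time frame is involved and every state is treated the same way.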
As you can see, there are pros and cons to both 1) and 2). In 1) the overall number of missing children doesn't matter, so you don't really need an east/west comparison like in 2). Further, the distribution should be better, and you are able to spot time-related events. In 2) you don't need to correct the data, it's easier to calculate the results, and you are able to see events that are not time-related. Overall, in this case I somewhat prefer 1) over 2).