Monthly Archives: March 2015

Revenge Pornography – Analysing Week 2’s Data

I can imagine the regular readers of my blog are just dying to know what the analysis of week 2’s data shows.  Wait no more: here are the latest figures from the data scraped from a revenge pornography Website.

Here is a summary of the data I collected in the second week.

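| Day | Number of Posts | Male | Female | Current | Removed |
|---|---|---|---|---|---|
| 8 | 14 | 1 | 13 | 10 | 4 |
| 9 | 29 | 1 | 28 | 18 | 11 |
| 10 | 6 | 0 | 6 | 5 | 1 |
| 11 | 9 | 1 | 8 | 8 | 1 |
| 12 | 24 | 4 | 20 | 13 | 11 |
| 13 | 2 | 0 | 2 | 1 | 1 |
| 14 | 28 | 0 | 28 | 14 | 14 |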

The first point is that there are more men in the data set for the second week (7 as opposed to 3), although there are also more posts in the second week than in the first, 112 as opposed to 104.  This time, 3 of the men had their posts removed before the 672 hours were up.
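The weekly totals quoted above can be checked by summing the columns of the two summary tables. This is a minimal sketch in Python, with the figures typed in by hand from the week 1 and week 2 tables; `WEEK_1`, `WEEK_2` and `column_totals` are illustrative names only and are not taken from the scraper.

```python
# Daily counts copied by hand from the two weekly summary tables.
# Each row is (day, posts, male, female, current, removed).
WEEK_1 = [
    (1, 22, 1, 21, 15, 7),
    (2, 16, 1, 15, 12, 4),
    (3, 11, 0, 11, 8, 3),
    (4, 19, 1, 18, 15, 4),
    (5, 11, 0, 11, 10, 1),
    (6, 0, 0, 0, 0, 0),
    (7, 25, 0, 25, 20, 5),
]
WEEK_2 = [
    (8, 14, 1, 13, 10, 4),
    (9, 29, 1, 28, 18, 11),
    (10, 6, 0, 6, 5, 1),
    (11, 9, 1, 8, 8, 1),
    (12, 24, 4, 20, 13, 11),
    (13, 2, 0, 2, 1, 1),
    (14, 28, 0, 28, 14, 14),
]

COLUMNS = ("posts", "male", "female", "current", "removed")


def column_totals(rows):
    """Sum each count column, skipping the day number in position 0."""
    return {name: sum(row[i + 1] for row in rows) for i, name in enumerate(COLUMNS)}


week1 = column_totals(WEEK_1)
week2 = column_totals(WEEK_2)
print(week1)  # {'posts': 104, 'male': 3, 'female': 101, 'current': 80, 'removed': 24}
print(week2)  # {'posts': 112, 'male': 7, 'female': 105, 'current': 69, 'removed': 43}
```

In both weeks the current and removed totals add up to the number of posts, which is a quick sanity check on the tables.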

I made the data into charts.  This first chart shows the number of posts to the revenge pornography Website in the second week.

Chart 1

The second chart shows the number of posts to the revenge pornography Website in the second week split by gender.

Chart 2

Next, I created a chart showing how many posts stayed live, or current, for the full 672 hours (28 days) over which I scraped data.

Chart 3

The current posts were further analysed.  First, they were split by gender and charted to show the maximum number of views that each post received over the month.

Chart 4

Chart 5

Again, the following charts are a mess, but they are useful for identifying trends.  The next two charts show the cumulative number of views per hour for each post with a status of current, split by gender.

Chart 6

Chart 7

The next two charts show the number of actual views per hour for all current posts split by gender.

Chart 8

Chart 9

As before, it would appear from this data that most views to the posts take place early on, within the first 50 hours.  However, further analysis will need to be done.
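The difference between the cumulative charts and the "actual views per hour" charts can be sketched in a few lines of Python. This assumes that each hourly scrape records a running total of views for a post, and that an hour where the scraper did not run is stored as `None`. The function name `hourly_views` and the sample numbers are made up for illustration and are not taken from the data set.

```python
def hourly_views(cumulative):
    """Turn cumulative view counts into views gained since the previous reading.

    `cumulative` has one entry per hourly scrape; None marks an hour where
    no reading was taken. Views gained across a gap are attributed to the
    first hour that has a reading after the gap.
    """
    gains = []
    previous = None
    for count in cumulative:
        if count is None:
            gains.append(None)   # no reading this hour
            continue
        if previous is None:
            gains.append(None)   # first reading: nothing to compare against
        else:
            gains.append(count - previous)
        previous = count
    return gains


# Made-up numbers for illustration only.
example = [100, 160, 190, None, 230, 235]
print(hourly_views(example))  # [None, 60, 30, None, 40, 5]
```

How to treat missed hours is a design choice. Here the gain after a gap covers more than one hour, so it may look like a small spike on a chart.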

Revenge Pornography – Analysing Week 1’s Data

I have been analysing the first week of complete data that I have scraped from the revenge pornography Website.  The analysis I have performed so far has been pretty basic, mainly to familiarise myself with the data I have.

Here is a summary of the data I collected in the first week.

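| Day | Number of Posts | Male | Female | Current | Removed |
|---|---|---|---|---|---|
| 1 | 22 | 1 | 21 | 15 | 7 |
| 2 | 16 | 1 | 15 | 12 | 4 |
| 3 | 11 | 0 | 11 | 8 | 3 |
| 4 | 19 | 1 | 18 | 15 | 4 |
| 5 | 11 | 0 | 11 | 10 | 1 |
| 6 | 0 | 0 | 0 | 0 | 0 |
| 7 | 25 | 0 | 25 | 20 | 5 |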

From this table it is possible to create some charts of the data, which makes it easier for me to understand.

This first chart shows the number of posts to the revenge pornography Website in the first week.

chart

The second chart shows the number of posts to the revenge pornography Website in the first week split by gender.

chart2

Next, I created a chart showing how many posts stayed live, or current, for the full 672 hours (28 days) over which I scraped data.

chart3

The removed posts were all of female subjects.  All the posts featuring males in the first week stayed current for the full month.

The current posts were further analysed.  First, they were split by gender and charted to show the maximum number of views that each post received over the month.

chart4

chart5

There is a large discrepancy between the number of views that male and female posts receive.

The next chart looks like a bit of a mess, but was necessary in order to see what was going on with the page views.  This chart shows the number of cumulative views per hour for each posting depicting a female and with a status of current.  As a result there are 77 data series, so it is a bit of a mess, but it is possible to identify a trend.

chart6

The following chart is also a bit of a mess, but it also shows something interesting.  It shows the number of actual views per hour for all current female posts.

chart6

It would appear from this data that most views to the posts take place early on, within the first 50 hours.  However, further analysis will need to be done.

A Revenge Pornography Website – methodology and data.

My data collection is well under way and will soon be finished.  This is a really exciting stage where I can sit back and watch the data flood in.  My data is collected via a custom-built web scraper written in Python using Selenium WebDriver.  Once the data has been scraped, it is stored in a database.  The scraper runs once an hour and collects data on new posts to a revenge pornography Website, on how many views each post receives and on how many comments are made about each post.  I collected data on new posts for 28 days and will collect data about each post for 28 days after it appears.  Therefore, I will potentially have data about the views and comments each post receives for 672 hours (28 days x 24 hours).  There will be some exceptions.  Sometimes posts are removed from the website; in these cases I will have data about the post up until the time it was removed.  There are also cases where, despite the robustness of the Web scraper, it did not run and data was not collected.
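To give a feel for the storage side, here is a minimal sketch of how hourly readings could be kept. It is a sketch under assumptions, not the actual scraper: the post does not name the database or describe its layout, so SQLite, the `snapshots` table, its columns and the function names are all hypothetical. The Selenium part is left out because it depends on the structure of the pages being scraped.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per post per hourly run.
SCHEMA = """
CREATE TABLE IF NOT EXISTS snapshots (
    post_id    TEXT    NOT NULL,
    scraped_at TEXT    NOT NULL,   -- ISO 8601 timestamp of the hourly run
    views      INTEGER NOT NULL,
    comments   INTEGER NOT NULL,
    PRIMARY KEY (post_id, scraped_at)
);
"""


def open_database(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_snapshot(conn, post_id, views, comments, scraped_at=None):
    """Store one hourly reading for one post."""
    if scraped_at is None:
        scraped_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO snapshots (post_id, scraped_at, views, comments) "
            "VALUES (?, ?, ?, ?)",
            (post_id, scraped_at, views, comments),
        )


def readings_for(conn, post_id):
    """Return (scraped_at, views, comments) rows for a post, oldest first."""
    return conn.execute(
        "SELECT scraped_at, views, comments FROM snapshots "
        "WHERE post_id = ? ORDER BY scraped_at",
        (post_id,),
    ).fetchall()


# Made-up readings for illustration only.
conn = open_database()
record_snapshot(conn, "post-1", 100, 0, "2000-01-01T00:00:00+00:00")
record_snapshot(conn, "post-1", 160, 2, "2000-01-01T01:00:00+00:00")
print(readings_for(conn, "post-1"))
```

With a layout like this, a removed post simply stops gaining rows, and an hour where the scraper did not run shows up as a missing timestamp.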

Over the 28 days a total of 396 posts were added to the revenge pornography website.  On 2 days during this time no posts were added.  Of these posts, 378 featured women and 18 featured men.  This means about 95% of the posts were of women.

Frighteningly, 396 posts, each with the potential for data to be collected for 672 hours, means that there could be a maximum of 266,112 lines of data in one table of my database alone.  Suddenly it seems as if I may drown in data.  Despite the large numbers, I am looking forward to what the data analysis will reveal.
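The arithmetic behind that upper bound is short enough to write out. The constant names below are illustrative only; the figures are the ones given in this post.

```python
POSTS = 396          # posts added during the 28-day collection window
DAYS_TRACKED = 28    # each post is followed for 28 days
RUNS_PER_DAY = 24    # the scraper runs once an hour

max_rows_per_post = DAYS_TRACKED * RUNS_PER_DAY
max_rows_total = POSTS * max_rows_per_post

print(max_rows_per_post)  # 672
print(max_rows_total)     # 266112
```

The real figure will be lower, because removed posts stop producing rows and the scraper occasionally missed a run.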

The first few posts that I have looked at have revealed some interesting details.   The first post to the website in the data collection period remained live for the entire 672 hours.  During this time the scraper ran successfully 656 times, meaning 16 times the scraper did not collect data.  On the first run the scraper recorded that the post had received 2169 views and by the last run, 28 days later, it had received 19631 views.  50% of the total views received by the post occurred within the first 14 hours of the post going live.  75% occurred within 43 hours and 90% within 109 hours.
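Figures like the 50%, 75% and 90% marks above can be worked out from the cumulative view counts for a post. This is a rough sketch rather than the exact procedure used: it assumes one reading per hour since the post went live, with `None` for hours where the scraper did not run, and it measures each share against the last recorded count. The function name and the sample numbers are made up for illustration.

```python
def hours_to_reach(cumulative, fractions=(0.5, 0.75, 0.9)):
    """For each fraction, return the first hour (counting from 1) at which
    the cumulative view count reaches that share of the final count.

    `cumulative` holds one reading per hour since the post went live;
    None marks an hour where no reading was taken.
    """
    readings = [
        (hour, views)
        for hour, views in enumerate(cumulative, start=1)
        if views is not None
    ]
    if not readings:
        return {}
    final = readings[-1][1]
    result = {}
    for fraction in fractions:
        target = fraction * final
        # The last reading always meets the target, so next() finds a match.
        result[fraction] = next(hour for hour, views in readings if views >= target)
    return result


# Made-up numbers for illustration only.
example = [200, 450, 600, None, 800, 910, 950, 1000]
print(hours_to_reach(example))  # {0.5: 3, 0.75: 5, 0.9: 6}
```

One caveat is that a post may already have views at the first scrape (2169 in the case above), so anything that happened before the first reading is counted as falling within the first hour.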

Whilst I have done this with the first few posts in my data set, it will take me a while to get them all done.  I envisage a number of hours ahead slaving over my keyboard, but at least I am not transcribing interviews!

Why Post to a Revenge Pornography Website?

Revenge Pornography as a term is a misnomer.  Some people do post intimate material of an ex-partner as a means of taking revenge, for instance in the wake of an acrimonious relationship breakdown.  Comments made by those who post material in these circumstances often reference revenge and suggest that the person being exposed without their consent did something to warrant such behaviour.  Other people are posted for other reasons, such as not having a sexual relationship with the person they sent the material to.  Sometimes the poster finds the material (on a mobile device, online or in a friend’s email) and decides to post it.  Reposts also happen, where someone is reposted either by the original poster or by someone else who has accessed and saved the material.

Suggesting that posts are made purely out of a motivation for revenge is incorrect and does not help in understanding the issue.  The new law recently passed in England and Wales makes it “an offence for a person to disclose a private sexual photograph or film if the disclosure is made a) without the consent of an individual who appears in the photograph or film, and b) with the intention of causing that individual distress”.  The key word here is distress: the material has to be posted with the intention of causing distress to the person who appears in it.  There is no requirement for a prior relationship to have been in place.

There are limitations to the law.  Firstly, as is the case with other local laws governing the Web, its reach will be a matter of jurisdiction.  Secondly, the law aims to deal only with the poster of the material, and only if they can be identified.  There is no provision to stop people from viewing or engaging with the material, only to stop them posting it.  It may be technically impossible to stop people from viewing the material, and the only hope is that people learn to be moral and considerate of others.