Interactive visualization of WordPress blog view statistics.

As a follow-up to our episode on ParaViewWeb, we present a visualization of the WordPress statistics for episode page views on our blog. While most listeners subscribe to our RSS feed, the blog page views provide a glimpse into how attention to each episode evolves over time.

Hat tip goes to Anthony Scopatz for tracking down a method to grab the WordPress stats. For a summary of the instructions, wget the URL http://stats.wordpress.com/csv.php:

wget -O instructions.txt http://stats.wordpress.com/csv.php

To use the API, you’ll have to sign up for a free Akismet key. The following request was used to get the raw site post views (API key has been replaced):

wget -O stats.csv 'http://stats.wordpress.com/csv.php?api_key=0123456789abc&blog_uri=http://inscight.org&blog_id=0&table=postviews&days=-1'
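
The same request can also be assembled programmatically. The following is a minimal sketch, not part of the original workflow: the helper name build_stats_url is hypothetical, and only the query parameters shown in the wget command above are used. Note that urlencode percent-encodes the blog URI, which the hand-written command leaves as-is.

```python
from urllib.parse import urlencode

# Endpoint taken from the wget command above.
STATS_ENDPOINT = "http://stats.wordpress.com/csv.php"


def build_stats_url(api_key, blog_uri, table="postviews", days=-1, blog_id=0):
    """Hypothetical helper: assemble the stats request URL.

    The parameter names mirror the query string in the wget example;
    nothing beyond those parameters is assumed about the API.
    """
    params = [
        ("api_key", api_key),
        ("blog_uri", blog_uri),
        ("blog_id", blog_id),
        ("table", table),
        ("days", days),
    ]
    return STATS_ENDPOINT + "?" + urlencode(params)


# Placeholder key, as in the example above.
url = build_stats_url("0123456789abc", "http://inscight.org")
print(url)
```

The resulting string can be passed to wget or to any HTTP client.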

Our raw data set looks something like this:

"2011-02-19",0,"Home page","http://inscightpodcast.wordpress.com/",37
"2011-02-19",1,"Episode 0: Strata Con & Big Data","http://inscight.org/2011/02/16/episode_0/",23
"2011-02-19",13,"Bio","http://inscight.org/bio/",1
"2011-02-18",0,"Home page","http://inscightpodcast.wordpress.com/",49
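
To make the column layout explicit, here is a small sketch that parses the sample rows above with the standard library. The field names date, post_id, post_title and views match those used by the script below; post_permalink is an assumed name for the URL column, and parse_rows is a hypothetical helper.

```python
import csv
import io
from datetime import datetime

# The four sample rows shown above (no header line).
SAMPLE = '''"2011-02-19",0,"Home page","http://inscightpodcast.wordpress.com/",37
"2011-02-19",1,"Episode 0: Strata Con & Big Data","http://inscight.org/2011/02/16/episode_0/",23
"2011-02-19",13,"Bio","http://inscight.org/bio/",1
"2011-02-18",0,"Home page","http://inscightpodcast.wordpress.com/",49
'''


def parse_rows(text):
    """Turn raw CSV text into a list of dicts with typed values."""
    rows = []
    for day, post_id, title, permalink, views in csv.reader(io.StringIO(text)):
        rows.append({
            "date": datetime.strptime(day, "%Y-%m-%d").date(),
            "post_id": int(post_id),
            "post_title": title,
            "post_permalink": permalink,  # assumed column name
            "views": int(views),
        })
    return rows


rows = parse_rows(SAMPLE)
for row in rows:
    print(row["date"], row["post_id"], row["views"])
```

Rows arrive newest first, which is why the script below reads the first day from the last record.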

Next, the POP (power of Python) is applied to select the episode posts and bin their views by day, counting from the inaugural day of the blog. To get a more continuous approximation of post attention, we use kernel density estimation. In this case it is equivalent to convolution with a Gaussian, except that the sigma is chosen by a rule of thumb (SciPy's gaussian_kde defaults to Scott's rule). Finally, we export the dataset as a VTK image.

#!/usr/bin/env python

import re

# Note: csv2rec was removed in matplotlib 3.1, so this import needs an older release.
from matplotlib.mlab import csv2rec

import numpy as np

from scipy.stats import gaussian_kde

import vtk


# Get our data
stats_filename = 'stats.csv'
stats = csv2rec(stats_filename)


# Only look at episode posts
episode_post_ids = dict()
episode_title_re = re.compile(r'^[Ee]pisode [0-9]+:')
for rec in stats:
    if episode_title_re.search(rec['post_title']):
        episode_post_ids[rec['post_id']] = rec['post_title']

# Sort by post ID, which follows chronological order
post_ids = sorted(episode_post_ids)

# Put the views per day in a numpy array indexed by episode
start_day = stats[-1]['date'].toordinal()
end_day   = stats[0]['date'].toordinal()
days = end_day - start_day + 1
posts = len(post_ids)
post_views = np.zeros((len(post_ids), days))

# To put the raw day-binned data into the array instead, do the following
#for rec in stats:
    #if rec['post_id'] in post_ids:
        #day_idx = rec['date'].toordinal() - start_day
        #post_id_idx = post_ids.index(rec['post_id'])
        #views = rec['views']
        #post_views[post_id_idx, day_idx] = views

# Gaussian Kernel Density Estimation to smooth the result
post_view_samples = [[] for i in range(posts)]
for rec in stats:
    if rec['post_id'] in post_ids:
        day_idx = rec['date'].toordinal() - start_day
        post_id_idx = post_ids.index(rec['post_id'])
        views = rec['views']
        for ii in range(views):
            post_view_samples[post_id_idx].append(day_idx)

positions = np.arange(days)
for episode in range(len(post_views)):
    episode_views = np.array(post_view_samples[episode], dtype=np.float64)
    kernel = gaussian_kde(episode_views)
    smoothed_views = kernel.evaluate(positions) * len(episode_views)
    post_views[episode,:] = smoothed_views

# Export to a VTK image for analysis with ParaView.
image_importer = vtk.vtkImageImport()
post_views_str = post_views.tobytes()  # tostring() is a deprecated alias of tobytes()
image_importer.CopyImportVoidPointer(post_views_str, len(post_views_str))
image_importer.SetDataScalarTypeToDouble()
image_importer.SetNumberOfScalarComponents(1)
image_importer.SetDataExtent(0, post_views.shape[1] - 1,
        0, post_views.shape[0] - 1,
        0, 0)
image_importer.SetWholeExtent(0, post_views.shape[1] - 1,
        0, post_views.shape[0] - 1,
        0, 0)
image_importer.SetDataSpacing(1.0/7.0, 1.0, 1.0)
image_importer.Update()

writer = vtk.vtkStructuredPointsWriter()
writer.SetInputConnection(image_importer.GetOutputPort())
writer.SetFileName('post_views.vtk')
writer.Update()
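
To illustrate why the smoothing step in the script amounts to a Gaussian convolution, here is a dependency-free sketch of the same idea. It assumes SciPy's default bandwidth (Scott's rule), which for 1D data gives sigma = sample standard deviation times n^(-1/5). The function names scott_sigma and smooth_views are hypothetical, and the sample data is made up for illustration.

```python
import math
from statistics import stdev


def scott_sigma(samples):
    """Scott's rule for 1D data: sample std (ddof=1) times n**(-1/5)."""
    return stdev(samples) * len(samples) ** (-1.0 / 5.0)


def smooth_views(samples, days):
    """Sum one unit-area Gaussian per recorded view, evaluated on each day.

    This corresponds to kernel.evaluate(positions) * len(samples) in the
    script above, so the curve integrates to the total number of views.
    """
    sigma = scott_sigma(samples)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    curve = []
    for day in range(days):
        total = sum(math.exp(-0.5 * ((day - s) / sigma) ** 2) for s in samples)
        curve.append(norm * total)
    return curve


# Made-up example: 5 views on day 10, 3 on day 11, 2 on day 12.
samples = [10.0] * 5 + [11.0] * 3 + [12.0] * 2
curve = smooth_views(samples, days=30)
print(round(sum(curve), 2))
```

Because each view contributes a Gaussian of unit area, the smoothed curve preserves the total view count while spreading it over neighboring days. Like gaussian_kde, this sketch needs at least two distinct sample values, since otherwise the standard deviation is zero.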

Our eyes and visual cortex have an easier time detecting detailed variations in a scalar field when it is represented as a plot rather than as a colormap.

Insight views. Time progresses from left to right and episode number is in the vertical direction.

Instead of a line plot, as used for 1D datasets, we can warp the height of the grid by its scalar value for a 2D dataset to reveal finer details in the intensity patterns of an image. In ParaView, this takes three steps: “File -> Open”, “Filters -> Alphabetical -> Clean To Grid”, and “Filters -> Alphabetical -> Warp By Scalar”. Saving a ParaView state file makes it easy to import the result into ParaViewWeb!

Click here for an interactive 3D representation of the dataset!

Filed under: General Interest
