I've added these for an article I recently wrote for my website. Their behaviour is limited but it's a start.
Friday, 17 April 2015
D3 Tooltips for a line chart
I've added these for an article I recently wrote for my website. Their behaviour is limited but it's a start.
Thursday, 16 April 2015
Ofcom Broadband Report 2013 Analysis
It basically just looks at the data contained within the Ofcom 2013 broadband report with charts and maps created using D3.
I'll create some posts to explain how I did it later. Hope you enjoy!
Monday, 9 March 2015
Petrol Price Exploration
Introduction
- Petrol prices - ONS
- Brent crude oil prices - US Energy Information Administration
- Dollar to Pound conversion rates - Bank of England
The Brent crude spot prices are a daily series although there are gaps in the series for holidays etc. The data goes back to 1987.
The dollar to pound conversion rate is again a daily series and again there are gaps on the holidays. The gaps are similar to the gaps found in the Brent crude prices but are not always the same - different country, different holidays.
It's interesting to note that the drop in price in 2008 was much larger than the recent drop. The two series are show similar trends but they're not currently directly comparable. Let's see if we can improve on that.
Manipulation
Correlation
The plot looks indicative of a strong relationship between the two prices. However note that there are very few points towards the bottom left. This is the recent drop in prices. Taking these out we get:
Further Analysis
- Are the peaks in the two series closer together than the troughs,
- Are the positive gradients of the pump prices more steep than the negative gradients.
Wednesday, 4 February 2015
Extracting data from text files using Java
This post links to four articles on how to read data line by line from a simple text file. The four articles are all very similar but detail different methods of accessing the data. They all consist of five steps:
- Defining the file
- Opening the file
- Reading in the data
- Doing something with the data
- Outputting the result
- Extract data from a simple text file
- Extract data from gzipped text files - useful if the individual files are large. The biggest bottleneck is disc access times, especially HDDs.
- Extract data from zipped text files (zip archives)
- Extract data from a directory of text files - it's often the case that you want to analyse data from a whole raft of files.
Wednesday, 15 October 2014
New rim weighting Excel macro
So the input looks like this:
Clicking on the weight by order (new) button brings up the dialog:
Thursday, 14 August 2014
Kernel Smoothing Macro for Excel
Introduction
I won't go into the mathematical detail here as Wikipedia has several pages that describe the method in a far better way than I ever could. All I will say is that the method is relatively simple. All we're doing is calculating a weighted average of each point based on every other single point. There are two pages that are helpful:
- Kernel Smoother - describes the kernel smoother and how it works
- Kernel statistics - describes the various kernels that can be used
The Macro
- The y-coordinates of the data series
- The x-coordinates of the data series
- The kernel to use
- The scale parameter of the kernel
The kernel is the function used to calculate the weight. There are 8 kernel functions to choose from:
- Gaussian
- Uniform
- Triangular
- Epanechnikov
- Quartic
- Cubic
- Logistic
- Cosine
The scale parameter determines the 'width' of the function i.e. how many neighbouring points are used to calculate the new value of each point. The scale can either be a reference to a cell containing a number, a range or a number. If the scale is a range then it has to be the same size as the range of the y-coordinates if it's size is greater than 1 cell. Also the scales are used by applying scale 1 to point 1, scale 2 to point 2, etc. for every point.
The Code
Example
This will also show the shortcomings of the method and one way of surmounting them (although not entirely).
Sunday, 8 June 2014
UK TV Listings in Excel
UK TV Listings in Excel
This article describes a simple way of getting TV programme listings from the Internet and into Excel using VBA.
For this I'm going to use the XMLTV feed provided by the Radio Times. This feed is usable only for personal and non-commercial use and is copyrighted. It is useful for analysing the TV guide though.
Method
The first thing to do is create a new module to hold the code. Let's call this TVGuide
. To hold all the data we need to create a new workbook and worksheet. Then we can set up an external data query from a web page to get the channel data. I used the macro recorder to do this part which is why it's a bit verbose. I'm not sure how much of the code I need but it doesn't do any harm leaving it there.
Sub GetChannels()
'
' Get Channels from Radio Times
'
Dim wb As Workbook
Dim wk As Worksheet
Set wb = Application.Workbooks.Add
Set wk = ActiveWorkbook.Worksheets.Add
wk.Name = "Channels"
With ActiveSheet.QueryTables.Add(Connection:="URL;http://xmltv.radiotimes.com/xmltv/channels.dat", Destination:=Range("$A$1"))
.Name = "channels"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
You'll note that the address of the feed is http://xmltv.radiotimes.com/xmltv/channels.dat
. This page contains a list of the channel numbers with their descriptions separated by a pipe character. So the next job is to separate these into two columns.
'Split data by pipe
Columns("A:A").Select
Selection.TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar _
:="|", FieldInfo:=Array(Array(1, 1), Array(2, 1)), TrailingMinusNumbers:=True
'Delete row 2
Rows("2:2").Select
Selection.Delete Shift:=xlUp
'Label columns
Range("A1").FormulaR1C1 = "Channel Number"
Range("B1").FormulaR1C1 = "Channel"
'Change widths of columns
Columns("A:A").EntireColumn.AutoFit
Columns("B:B").EntireColumn.AutoFit
'Sort data
wk.Sort.SortFields.Clear
wk.Sort.SortFields.Add Key:=Range("B:B"), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
With wk.Sort
.SetRange Range("A1").CurrentRegion
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
The above code needs to be inserted before the End Sub
of the previous code example. I've also labelled the columns, deleted a row, changed the widths of the columns and sorted the data by channel description. This makes it easier to find the channel you're interested in.
So, that's nice. You have a list of channels sorted by their description. It's not very useful though. Fortunately, it's also easy to get the actual TV guide data for each channel. The address is almost exactly the same as before but instead of channel.dat
at the end of the address we have {channel number}.dat
where {channel number}
is the number from the previous list.
Now all we need is some way to use the first list to get the actual time data. The easiest way (well, I thought it was easy) is to create a class module and enable events on the workbook that you've created.
So let's create that new class module and call it ChannelSelector
. We want to allow events within the workbook we've created. Therefore we need to create a variable for the workbook in question and use the WithEvents
keyword. We also need to attach this newly defined variable to the workbook that was created earlier.
Option Explicit
Private WithEvents App As Workbook
Private Sub Class_Initialize()
Set App = ActiveWorkbook
End Sub
The above five lines are almost all that's needed to do this. We also need to create a variable within the TVGuide
module for this class
i.e. instantiate an object of this class. So:
Option Explicit
Public ChannelHandler As ChannelSelector
defines the variable in the module. We then 'set' this variable to the class defined above. This is done at the end of the GetChannels()
subroutine.
'Set events for channel numbers
Set ChannelHandler = New ChannelSelector
Now that we've allowed user-defined events for this workbook, we need to create a subroutine to handle them. In this case we want to handle the double click event when the user double clicks on the channel number. Therefore we use the SheetBeforeDoubleClick
event. Below is the code for the event handler.
Private Sub App_SheetBeforeDoubleClick(ByVal Sh As Object, ByVal Target As Range, Cancel As Boolean)
Dim channel As String
Dim wk As Worksheet
'Cancel event
Cancel = True
channel = Target.Text
Set wk = ActiveWorkbook.Worksheets.Add
wk.Name = channel
With ActiveSheet.QueryTables.Add(Connection:="URL;http://xmltv.radiotimes.com/xmltv/" & channel & ".dat", Destination:=Range("$A$1"))
.Name = "chan"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
'Split data by tilde
Columns("A:A").Select
Selection.TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar _
:="~", FieldInfo:=Array(Array(1, 1), Array(2, 1)), TrailingMinusNumbers:=True
End Sub
This uses almost exactly the same code as before to extract data from the XMLTV feed. This time however the data needs to be split by the tilde (~) character.
Conclusion
And there you have it, the times for the next two weeks for all the programs for the channel that you just double clicked.
Obviously this is just a simple example and there is no error checking whatsoever. If you double click on something other than a relevant channel number then the program will fail with a runtime error.
There's a lot that can be done to expand on this and produce a nice TV Guide. I just wanted to give a simple introduction though.