Data Analysis Tool – A How To Guide – Lesson 1, Part 2

By | 25th August 2018

I think that here we had better start with some terminology so that when I use an expression we all know what I am talking about.

So, here we go with…

Part 1 – Files, Folders and Things That Go Zip in the Night

If you are clear as to what Files, Folders and the Zip File/Folder are then please scroll all the way to Part 2.

I think that it’s fair to say that we all have a good idea of what a file is. Files are things that are like spreadsheets, documents and the like. They come in all sorts of shapes and sizes but at the end of the day a file is an application or contains data which is used by an application.

The DA Tool is a file, for example. It’s an Excel file and there’s a pile of other files which it needs to operate: there’s the two data files which which contain information which is needed by the DA Tool and then there’s all the input data files which is reads and works through: these will be the files which you download from the UK Horse Racing site.

So, I think that, loosely speaking, one could say that a file is an application or data that the application needs to run or processes. Any output is often placed into a file, or a series of files.

I think that’s the best that I can do without confusing people much more.

Now, these files live somewhere on the computer. Each file must live in a folder.

Imagine a tree, and at the bottom of the tree there’s the root. That’s the base of the tree. The root is the first folder in the tree.

Now any folder can contain any number of files and/or folders. So this means that if one looks in a folder then there could be a folder in there and then one in that folder and further folder in that one and so on, ad infinitum. Like a set of Russian dolls.

So the root folder can, and certainly shall, contain a number of different folders. One folder, Windows, is where the operating system keeps its stuff and this is one place where we shouldn’t go unprepared because messing around with files and folders in C:\Windows could well really bork one’s system.

These files and folders is what the File Explorer application can show us. We’ve learned how to create a folder called ‘UKHR’ which lives directly under the root of the C: drive (there will be no A: or B: drives in the 21 century, thanks to a diseased naming convention that Microsoft should have sorted out about thirty years ago, but let’s not go there right now other than to say that more often than not the main drive is the C: drive and everything ‘grows’ up from the root there).

We can go ‘down’ from the root folder to the UKHR folder which we created last time, and then we can see that we can go ‘down’ further into one of the other three folders that we have also made.

Note that we always go ‘down’ further from the root and ‘up’ towards the root. This is how we describe folder structures. So imagining this as a tree is a little tricky unless one starts to think of trees growing in Australia which, as all skoolboys kno, grow upside down…

Nonetheless, the common terminology for going through the file system is that ‘upwards’ is always towards the root and ‘downwards’ is always further away from the root.

But it is important to think of the structure of these folders as a tree as they all branch out more and more – the further and further one goes away from the root.

Back now to what we are doing and we have a number of folders underneath the UKHR folder, and it’s time to put some files in there. What we will do is to put the DA Tool and its associated files into the following folders like this:

But, first, we need to consider this strangeness that is the Zip folder.

If I wanted to send you a collection of files and folders then I could just simply send you the collection of files via email and then tell you into which folder I want each file to go into.

Not only is this going to be error-prone if I send you dozens of files, all having to live in specific folders, but it’ll be impossible for me to describe what I want to go where and then equally impossible for you to decipher my instructions and to put each file into the correct folder.

So, wouldn’t it be lovely if I could roll the whole lot up into one bundle (or zip it into a bundle) and then send the bundle to you for you to unpack (or unzip) and then, lo and behold, you have the right folders with the correct files in each.

This then, is the Zip Folder. Those who have been playing with computers for decades (easily spotted by their world weary brow beaten demeanour and, often, a long grey beard) will refer to these as ‘Zip Files’. This is because the end result is a file (with the extension of ‘.zip’) that one can send as one single file.

However, the younger whipper-snappers will, rightly, see this resultant object as a folder as it contains other folders and files all neatly zipped up and they will refer to it as a ‘Zip Folder’, or ‘Zipped Folder’.

These are the same thing and the name is interchangeable. And neither is more correct than the other (except that I will often call them ‘Zip Files’, so for the purpose of all this: this is more equally correct than the other names, except when I call them Zip Folders and then I am also right).

So the Zip File is really a folder.

The reason why we have to consider this is that the Zip File is a convenient way of getting data down from the internet. It’s easier to download a Zip File and to copy out the contents than by any other means.

But, as ever, this is Windows and this means that there’s a big But.

The first rule of thumb when it comes to Zip Files is that users should never use them directly, they should always copy stuff out of the Zip File into the file folders because, despite the ease of being able to see the contents of Zip Files within File Explorer, files within a Zipped Folder often can’t be used.

This is where Windows does its usual trick of doing half of something really good and the other half badly. In today’s Windows we can look easily inside a Zipped Folder as if it were a regular folder, but that’s all we can often do. We can’t always edit or handle some of the files within the folder. It would have been fantastic if we were able to do so, but as we can’t we will always have to remember that the contents of the Zipped File have to be copied out to somewhere on our hard drives.

If we don’t do that then things start to fall over. So this is a golden rule: never work with Zipped Files apart from using them to roll up files and folders in order to transfer them to someone else or to another computer. We always have to make sure that we copy stuff out of the Zipped File first.

It therefore becomes necessary for us to be sure that we can differentiate between a Zipped Folder and a normal folder.

Now, you may have wondered why I have made such a fuss in the last lesson about having extensions to files: this is one of the reasons why.

Imagine a folder containing two items: the first is another folder and the second item is a Zipped folder. This is what it looks like without the file extensions being shown.

At first glance one can see that there’s two folders called ‘AFolder’ and it’s not easy to see that one of them is actually a Zip File.

Let’s look at this again but with the extensions enabled.

Here there’s no danger of missing the information that is a zipped file.

This is one (one of many) good reason why extensions should be made visible to the user within File Explorer.

Time now to welcome back those who have skipped over this first section.

Part 2 – Downloading the DA Tool and Other Stuff

Right, let’s download some files.

First of all let’s go here:

The first link on the page should say “Data Analysis Tool (Version XX.X)”, click on that and then the following dialog will appear. This is how it appears in my Firefox browser.

If you select the ‘Open With’ option above before pressing OK then it makes like a little simpler if you’re not sure about your machine’s file structure. So, unless you’re happy going digging into the Download folder of your machine and looking for the .zip file then select the ‘Open With’ option.

When the file has been downloaded a new instance of File Explorer (what they call Windows Explorer on my machine, it seems) the it will show the contents of the .zip file.

What you then do is to select the DataAnalysis file, which should have the extension of .xlsm and then right click on the file name…

…and then press Copy.

The file is now in the machine’s Clipboard, which is a temporary storage copying/cutting & pasting facility, and then all one now does is to find the DAtool folder which is under the C:\UKHR folder. Once you’ve found the DAtool folder click on it in File Explorer so that the contents of the folder (which should be empty!) is shown on the right hand panel of the File Explorer.

Then in the empty space, right click your mouse again and select Paste. The DA Tool file will then appear in the folder.

Okay, do the same for the two Steerfiles from the same web page and also put these into the same C:\UKHR\DAtool folder.

You have enough now to start the DA Tool. However, it won’t run because of the diseased Microsoft Security (I’ll come to that in a bit) and also there’s no input data to analyse.

As you may know the DA Tool uses the monthly CSV files, the ones with the results in, and you need to download these. So, make sure that you have logged into the Members Area of UKHR (it’s best to do this in another tab of your browser) and then follow the third link to get to the CSV Results data.

In a similar way to the above steps download as many of these files as you like, but this time, pop them into the DAtoolData folder.

This is important: Please don’t put anything other than the CSV Results file into this DAtoolData folder. Put any other CSV file in there and the DA Tool will crash.

Going back to the two steer files, the two files with the .txt suffix: if you open them up with a text editor then you will see that these have been filled with data which gives the folders of where the output of the DA Tool goes and where it will look for the data to come in. So if you ever decide to move the DA Tool and files and folder to another part of the computer then don’t forget to modify these.

Now, the last thing is to sort out the Microsoft Diseased Security Settings.

Part 3 – Sorting out the Diseased Microsoft Security Settings

This is one of the many Microsoft induced banes of my life. It will be a bane of yours too. See, I am a sharing type of bloke and I am happily sharing this with you.

Many years ago there were two viruses that went around called the ILuvYou virus and the Melissa virus. Both of these were distributed by Word macro code (or VBA to give it its proper name) and was opened up by people who should have known better but managed to infect their computers and it sent out the same virus to the next person who should have known better. In hours clueless middle managers around the world were responsible for the reproduction of this virus until it was finally stomped out. Alas, the clueless middle managers remained unstomped.

The finger of blame was pointed at Microsoft even though they really weren’t to blame (this is one thing that I am not blaming them for), so they went into full knee-jerk response mode and decided that the VBA (that’s the code part of the Office applications, of which Excel is one) needed to be locked down so tight that it couldn’t run unless the user wants it to (I am, however, blaming them for this).

So, we will unscrew the shackles a bit so that we can run the DA Tool.

Here we go, it’s dead simple. First open up Excel and make sure that you can see a spreadsheet in front of you. If you can’t see a spreadsheet, empty or otherwise, then create a new spreadsheet.

Your version of Excel may differ to mine but on mine (Excel 2007, my favourite version) there is a big round thing in the top left hand corner which is actually a button. Click on that and a dialog box pops up and you ought to see a button marked ‘Excel Options’. If you can’t find that then please let me know.

We’re now looking at the Trust Center item in the left hand pane of the dialog box. Click on that and you will see a button marked ‘Trust Center Settings…’ click on that and yet another dialog box opens.

Click on the menu item marked ‘Macro Settings’ and then there’s two areas in the right hand pane. The top one is a marked ‘Macro Settings’ and is filled with four ‘radio buttons’. Select the last one which is marked “Enable all macros (not recommended; potentially dangerous code can run)” Seriously, it does really say that.

This does sound like something right out of The Hitch Hiker’s Guide, doesn’t it? What it should really say is “Enable all macros so that you can bloody well get on with some work, but our twitchy legal department has advised us to put this alternate fearsome message up instead”.

Then in the bottom half of the window there’s a box which asks us to trust access to the VBA project object model. Make sure that this is checked.

Almost there.

In the left hand menu list there’s an item called ‘Trusted Locations’. Go in there.

In the Trusted Locations it only allows code that you’ve allowed to run in the previous step to run provided the code is in a certain list of folders. You can now start to appreciate the amount of jerked knees in the Microsoft legal division; so much that it’s a surprise that their collective spines didn’t snap in two.

There’s a button named ‘Add New Location…’. Click it.

Add a new location as shown in the illustration below.

You can now run Excel VBA code from anywhere within the C:\UKHR folder and folders under there.

Part 4. – Starting the DA Tool

Go on, open up the DA Tool. All you need to do is to go into the folder where the DA Tool lives, C:\UKHR\DAtool and then click on the Excel file in there.

Providing that you have set everything up right you will see the following screen.

Yes, a lot of effort to get here. But that’s the hardest part over with. And, in doing so, you have improved the set up of your machine significantly.

Right, have a look at what I have indicated in the image above. Yours should look a little like this (without the arrows, boxes and things, of course). If it doesn’t then shout.

If it looks like the above then we know that VBA has been enabled correctly and then you’re ready to do your first run.

If what you see doesn’t look like that then there’s an error. For example, this following example shows that the Output Folder cannot be found. This could be because, for example, the folder name was mistyped or that the folder was in the wrong place.

Part 5 – Your First Run

See that big button in the top left hand corner marked ‘Run’?

Well, go on then. Follow its instruction and click on it.

And then when it’s finished (the screen will look different) look into the Output folder for a text file and we’ll come to that in the next lesson as we’ve done enough for now and, besides, I think that I deserve a very large gin with no tonic in it. Whilst I am busting the cork see if you can find the output report file and have a look at it in Notepad or something.