{"id":321,"date":"2015-11-08T15:59:41","date_gmt":"2015-11-08T13:59:41","guid":{"rendered":"http:\/\/tuijasonkkila.fi\/?p=321"},"modified":"2024-09-20T13:22:11","modified_gmt":"2024-09-20T10:22:11","slug":"about-43000-results","status":"publish","type":"post","link":"https:\/\/tuijasonkkila.fi\/?p=321","title":{"rendered":"About 43000 results"},"content":{"rendered":"<p>For a few days now, I&#8217;ve had my Google search archive with me. In my case, it&#8217;s a collection of 38 JSON files, containing search strings and timestamps. The oldest file dates back to mid-2006, which acts as a digital marriage certificate between me and the Internet giant.<\/p>\n<p><a href=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/files.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/files-300x276.png\" alt=\"JSON files of Google search archive\" class=\"alignnone size-medium wp-image-323\" width=\"300\" height=\"276\" srcset=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/files-300x276.png 300w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/files.png 615w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>It took no more than 15 minutes for Google to fulfill my wish to get the archive as a zipped file. For more information on how and where, see e.g. 
<a href=\"http:\/\/venturebeat.com\/2015\/04\/20\/google-now-lets-you-export-your-search-history\/\">Google now lets you export your search history<\/a>.<\/p>\n<p>Now, this whole archive business started when I was led to a very nice blog post by <a href=\"https:\/\/lisacharlotterost.github.io\/2015\/06\/28\/TUTORIAL-Google-Search-History\/\">Lisa Charlotte Rost<\/a>.<\/p>\n<p><a href=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/tweet.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/tweet-300x133.png\" alt=\"Tweet about Lisa Charlotte Rost\" class=\"alignnone size-medium wp-image-322\" width=\"300\" height=\"133\" srcset=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/tweet-300x133.png 300w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/tweet-624x276.png 624w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/tweet.png 654w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>I find it fascinating what you can tell about a person just by looking at her searches. Or rather, what kind of narratives s\/he builds upon them; to publish all search strings verbatim is not really an option.<\/p>\n<p>Halfway through the 4-week course <a href=\"http:\/\/journalismcourses.org\/ID31015.html\">Intermediate D3 for Data Visualization<\/a>, the theme is stacked charts. Maybe I could visualize, on a timeline, as a stacked area chart, some aspects of my search activity. But what aspects? What sort of person am I as a searcher?<\/p>\n<p>Quite dull, I have to admit. No major or controversial hobbies, no burning desire to follow the latest gadgets, only mildly hypochondriac, not much interest at all in self-help advice. Wikipedia is probably my number one landing site. 
Very often I use Google simply as a text corpus, an evidence-based dictionary: &#8220;Has this English word\/idiom been used in the UK or did I just make it up, or misspell it?&#8221; Unlike Lisa, who says in <a href=\"http:\/\/datastori.es\/visualizing-your-google-search-history-with-lisa-charlotte-rost-ds61\/\">episode #61<\/a> of the Data Stories podcast that now that she lives in a big city, Berlin, she often searches for directions, I do not. Well, compared to Berlin, Helsinki is indeed small, but we also have a superb web service for guiding us around here, <a href=\"http:\/\/www.reittiopas.fi\/en\/\">Journey Planner<\/a>. So instead of a search, I&#8217;ll go straight there.<\/p>\n<p>One area of digital life I&#8217;ve been increasingly interested in &#8211; and one that this blog and my <a href=\"http:\/\/blogs.aalto.fi\/suoritin\">job blog<\/a> reflect, too, I hope &#8211; is coding. Note, &#8220;coding&#8221; not as in building software but as in scripting, mashupping, visualizing. Small-scale, proof-of-concept data wrangling. Learning by doing. Part of it is of course related to my day job at Aalto University. For example, now that we are setting up a <a href=\"http:\/\/www.altmetric.com\/blog\/cris-and-altmetrics\/\">CRIS system<\/a>, I&#8217;ve been transforming, with XSLT, legacy publication metadata to XML. It needs to validate against the Elsevier Pure XML Schema before it can be imported.<\/p>\n<p>For a few years now, apart from XSLT, the other languages I have been writing in are R and Perl. I use Unix command line tools on a daily basis. Thanks to the D3 course, I&#8217;m also slowly starting to get familiar with JavaScript. 
Python has been on my list for a long time, but since the introductory course I took at <a href=\"https:\/\/www.csc.fi\/home\">CSC &#8211; IT Center for Science<\/a> some time ago, I haven&#8217;t really touched it.<\/p>\n<p><a href=\"https:\/\/www.reddit.com\/r\/ProgrammerHumor\/comments\/3dxkka\/computer_programming_to_be_officially_renamed\/\">I&#8217;m not the only one<\/a> who googles while coding. Mostly it&#8217;s about a specific problem: I need to accomplish something but cannot remember or don&#8217;t know how. When you are not a full-time coder, you forget details easily. Or, you get an error message you cannot understand. Whatever.<\/p>\n<p>Are my coding habits visible in the search history? If so, in what way?<\/p>\n<p>The first thing to do with the JSON files was to merge them into one. For this, I turned to R.<\/p>\n<pre>library(jsonlite)\n \n# list.files() takes a regular expression, not a glob\nfilenames &lt;- list.files(\"Searches\", pattern=\"\\\\.json$\", full.names=TRUE)\n# Parse every file, then write the combined list out as one JSON file\njsons.as.list &lt;- lapply(filenames, function(f) fromJSON(txt = f))\nalljson &lt;- toJSON(jsons.as.list)\nwrite(alljson, file = \"g.json\")\n<\/pre>\n<p>Then, just as Lisa did, I fired up Google Refine, and opened a new project on <code>g.json<\/code>.<\/p>\n<p>To do:<\/p>\n<ul>\n<li>add Boolean value columns for JavaScript, XSLT (including XPath), Python, Perl and R by filtering the query column with the respective search string<\/li>\n<li><a href=\"https:\/\/github.com\/OpenRefine\/OpenRefine\/wiki\/Recipes\">convert Unix timestamps to Date\/Time<\/a> (Epoch time to Date\/Time as String). 
For now, I&#8217;m only interested in the date, not the time of day<\/li>\n<li>export all Boolean columns and Date to CSV<\/li>\n<\/ul>\n<p><a href=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refinesearch.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refinesearch-300x221.png\" alt=\"Google Refine new column\" class=\"alignnone size-medium wp-image-332\" width=\"300\" height=\"221\" srcset=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refinesearch-300x221.png 300w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refinesearch-624x460.png 624w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refinesearch.png 780w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>Of the language names, R is the trickiest one to filter because it is just one character. Therefore, I needed to build a longish <a href=\"https:\/\/github.com\/OpenRefine\/OpenRefine\/wiki\/GREL-Boolean-Functions\">Boolean or expression<\/a> for that.<\/p>\n<p><a href=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refiner.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refiner-300x158.png\" alt=\"Google Refine text facet\" class=\"alignnone size-medium wp-image-334\" width=\"300\" height=\"158\" srcset=\"https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refiner-300x158.png 300w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refiner-1024x539.png 1024w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refiner-624x328.png 624w, https:\/\/tuijasonkkila.fi\/wp-content\/uploads\/2015\/11\/refiner.png 1267w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>Here I&#8217;m ready with R and Date, and checking the results with a text facet on the column r.<\/p>\n<p>Thanks to a clearly commented template by the D3 course leader, <a 
href=\"https:\/\/github.com\/alignedleft\/knightd3-intermediate\">Scott Murray<\/a>, <a href=\"http:\/\/bl.ocks.org\/tts\/81f2f5c0b7bbf62d7b3a\">the stacked area chart<\/a> was easy to do, but only after I had figured out how to process and aggregate yearly counts by language. Guess what &#8211; I googled for a hint, <a href=\"http:\/\/stackoverflow.com\/a\/13893701\">and got it<\/a>. The trick was, while looping over all rows by language, to define an object to store counts by year. Then, for every key (=year), I could push values to the <code>dataset<\/code> array.<\/p>\n<p>Do the colors of the chart ring a bell? I&#8217;m a Wes Anderson fan, and have waited for an excuse to make use of some of the <a href=\"https:\/\/github.com\/kbroman\/wesandersonJS\">implementations of the color palettes of his films<\/a>. This 5-color selection represents <a href=\"http:\/\/www.imdb.com\/title\/tt0362270\/\">The Life Aquatic With Steve Zissou<\/a>. The blues and browns are perhaps a little too close to each other, especially when used as inline font color, but anyway.<\/p>\n<p>Quite an R mountain there to climb, eh? It all started during the <a href=\"https:\/\/web.archive.org\/web\/20120415195727\/http:\/\/www.elag2012.com\/programme\/pre-conference\/an-introduction-to-r\">ELAG 2012<\/a> conference in Palma, Spain. Back then I was still working at the Aalto University Library. I had read a little about R before, but it was the pre-conference track <em>An Introduction to R<\/em> led by <a href=\"http:\/\/dlab.berkeley.edu\/people\/harrison-dekker\">Harrison Dekker<\/a> that finally convinced me that I needed to learn this. I guess it was the ease of installing packages (always a nightmare with Perl), reading in data, and quick plotting.<\/p>\n<p>So what does the large number of R searches tell? For one thing, it shows my active use of the language. At the same time though, it shows that I&#8217;ve needed a lot of help. A lot. 
I still do.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Since few days now, I&#8217;ve had my Google search archive with me. In my case, it&#8217;s a collection of 38 JSON files, containing search strings and timestamps. The oldest file dates back to mid-2006, which acts as a digital marriage certificate of us, me and the Internet giant. It took no more than 15 minutes &hellip; <a href=\"https:\/\/tuijasonkkila.fi\/?p=321\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">About 43000 results<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[149],"tags":[77,48,102,98,100,99,13,101],"class_list":["post-321","post","type-post","status-publish","format-standard","hentry","category-diverse-coding","tag-d3-js","tag-google-refine","tag-google-search","tag-javascript","tag-perl","tag-python","tag-r","tag-xslt"],"_links":{"self":[{"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=\/wp\/v2\/posts\/321","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=321"}],"version-history":[{"count":22,"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=\/wp\/v2\/posts\/321\/revisions"}],"predecessor-version":[{"id":1004,"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=\/wp\/v2\/posts\/321\/revisions\/1004"}],"wp:attachment":[{"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=321"}],"wp:term":[{"taxonomy":"category","embeddable":tru
e,"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=321"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tuijasonkkila.fi\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=321"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}