
{"id":2211,"date":"2015-07-24T09:18:22","date_gmt":"2015-07-24T08:18:22","guid":{"rendered":"http:\/\/www.matthijskamstra.nl\/blog\/?p=2211"},"modified":"2015-08-22T20:04:46","modified_gmt":"2015-08-22T19:04:46","slug":"scraping-with-haxe","status":"publish","type":"post","link":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/","title":{"rendered":"Scraping with Haxe"},"content":{"rendered":"<p>A quick post about something that grabbed my attention quickly.<\/p>\n<h3>Scraping<\/h3>\n<blockquote>\n<p>Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites.<\/p>\n<\/blockquote>\n<p>Source: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Web_scraping\" target=\"_blank\">https:\/\/en.wikipedia.org\/wiki\/Web_scraping<\/a><\/p>\n<p>I already did some research on the subject when I was playing around with my raspberry pi. There is a lot out there, especially for python.<\/p>\n<p>But what about my favourite programming language <strong>Haxe<\/strong>?<\/p>\n<p>Again this is a quick search! And this is what I found.<\/p>\n<p>A (very?) old project from <a href=\"https:\/\/github.com\/jonasmalacofilho\" target=\"_blank\">Jonas Malaco Filho<\/a> on GitHub. Check out this code: <a href=\"https:\/\/github.com\/jonasmalacofilho\/jonas-haxe\" target=\"_blank\">jonas-haxe<\/a> and specifically the <a href=\"https:\/\/github.com\/jonasmalacofilho\/jonas-haxe\/tree\/haxe3migration\/src\/jonas\/scraper\" target=\"_blank\">scraper part<\/a> of it. 
Written for Neko, with primarily undocumented classes like <a href=\"http:\/\/api.haxe.org\/neko\/vm\/Mutex.html\" target=\"_blank\">neko.vm.Mutex<\/a>. Once you have the HTML page you can start getting the data from it!<\/p>\n<p>You will need an HTML\/XML parser; I found one written by <a href=\"https:\/\/bitbucket.org\/yar3333\" target=\"_blank\">Yaroslav Sivakov<\/a> &#8211; <a href=\"https:\/\/bitbucket.org\/yar3333\/haxe-htmlparser\" target=\"_blank\">HtmlParser haxe library<\/a>. It can also be found on haxelib: <a href=\"http:\/\/lib.haxe.org\/p\/HtmlParser\/\" target=\"_blank\">http:\/\/lib.haxe.org\/p\/HtmlParser\/<\/a><\/p>\n<p>I found a little (old) haxe\/php project that I will post as a reference: <a href=\"https:\/\/github.com\/andor44\/old_scraper\" target=\"_blank\">https:\/\/github.com\/andor44\/old_scraper<\/a>. But then it stops&#8230;<\/p>\n<p>Not a field that a lot of haxe-developers walk. Fun!<\/p>\n<h2>Update #2<\/h2>\n<ol>\n<li>The htmlparser doesn&#8217;t work with the HTML code I am scraping. So I need to focus on the parts I want to use. Regular expressions are the way to go, and I suck at them. Luckily I found an online tool that helps with testing the <a href=\"http:\/\/haxe.org\/manual\/std-regex.html\" target=\"_blank\">regex<\/a>: <a href=\"http:\/\/www.regexr.com\/\" target=\"_blank\">http:\/\/www.regexr.com\/<\/a> from an old flash hero <a href=\"https:\/\/twitter.com\/gskinner\/\" target=\"_blank\">gskinner<\/a>.<\/li>\n<li>Another thing I ran into was getting the data from <strong>https<\/strong> sites. You need something &#8220;extra&#8221; to download HTML files from there: install <a href=\"https:\/\/github.com\/tong\/hxssl\" target=\"_blank\">hxssl<\/a> via haxelib (<code>haxelib install hxssl<\/code>) and add it to your build.hxml (<code>-lib hxssl<\/code>).<\/li>\n<\/ol>\n<h2>Update #1<\/h2>\n<p>I am coding this with openfl\/regular expressions, but perhaps a better way to go is node.js! 
And you can use node.js with Haxe via <a href=\"https:\/\/github.com\/HaxeFoundation\/hxnodejs\" target=\"_blank\">hxnodejs<\/a> (perhaps not completely ready, but probably good enough for the examples below).<\/p>\n<ul>\n<li><a href=\"https:\/\/scotch.io\/tutorials\/scraping-the-web-with-node-js\" target=\"_blank\">https:\/\/scotch.io\/tutorials\/scraping-the-web-with-node-js<\/a> <\/li>\n<li><a href=\"http:\/\/nrabinowitz.github.io\/pjscrape\/\" target=\"_blank\">http:\/\/nrabinowitz.github.io\/pjscrape\/<\/a> <\/li>\n<li><a href=\"https:\/\/medialab.github.io\/artoo\/\" target=\"_blank\">https:\/\/medialab.github.io\/artoo\/<\/a> <\/li>\n<li><a href=\"https:\/\/github.com\/ruipgil\/scraperjs\" target=\"_blank\">https:\/\/github.com\/ruipgil\/scraperjs<\/a> <\/li>\n<li><a href=\"http:\/\/www.smashingmagazine.com\/2015\/04\/web-scraping-with-nodejs\/\" target=\"_blank\">http:\/\/www.smashingmagazine.com\/2015\/04\/web-scraping-with-nodejs\/<\/a> <\/li>\n<li><a href=\"https:\/\/impythonist.wordpress.com\/2015\/01\/06\/ultimate-guide-for-scraping-javascript-rendered-web-pages\/\" target=\"_blank\">https:\/\/impythonist.wordpress.com\/2015\/01\/06\/ultimate-guide-for-scraping-javascript-rendered-web-pages\/<\/a> <\/li>\n<li><a href=\"http:\/\/code.tutsplus.com\/tutorials\/screen-scraping-with-nodejs--net-25560\" target=\"_blank\">http:\/\/code.tutsplus.com\/tutorials\/screen-scraping-with-nodejs--net-25560<\/a> <\/li>\n<li><a href=\"http:\/\/noodlejs.com\/\" target=\"_blank\">http:\/\/noodlejs.com\/<\/a> <\/li>\n<li><a href=\"http:\/\/webscraper.io\/\" target=\"_blank\">http:\/\/webscraper.io\/<\/a> <\/li>\n<\/ul>\n<p>I can&#8217;t really say how to start with node.js and Haxe because I have never tried it, but from what I have read about it, that shouldn&#8217;t be a big problem. 
Fun again!<\/p>\n<h3>Read this<\/h3>\n<p>Some interesting reads&#8230; somewhat related to haxe<\/p>\n<ul>\n<li><a href=\"https:\/\/blog.hartleybrody.com\/web-scraping\/\" target=\"_blank\">https:\/\/blog.hartleybrody.com\/web-scraping\/<\/a><\/li>\n<li><a href=\"http:\/\/blog.databigbang.com\/tag\/java\/\" target=\"_blank\">http:\/\/blog.databigbang.com\/tag\/java\/<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/databigbang\/stream-oriented-knuth-morris-pratt\" target=\"_blank\">https:\/\/github.com\/databigbang\/stream-oriented-knuth-morris-pratt<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. There is a lot out there, especially [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":2231,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[360,385],"tags":[412,396,395,394],"class_list":["post-2211","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-haxe","category-openfl","tag-haxe","tag-web-data-extraction","tag-web-harvesting","tag-web-scraping"],"aioseo_notices":[],"aioseo_head":"\n\t\t<!-- All in One SEO 4.9.8 - aioseo.com -->\n\t<meta name=\"description\" content=\"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. 
There is a lot out there, especially\" \/>\n\t<meta name=\"robots\" content=\"max-image-preview:large\" \/>\n\t<meta name=\"author\" content=\"Matthijs Kamstra\"\/>\n\t<link rel=\"canonical\" href=\"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/\" \/>\n\t<meta name=\"generator\" content=\"All in One SEO (AIOSEO) 4.9.8\" \/>\n\t\t<meta property=\"og:locale\" content=\"en_US\" \/>\n\t\t<meta property=\"og:site_name\" content=\"[mck] | a polymath zapper\" \/>\n\t\t<meta property=\"og:type\" content=\"article\" \/>\n\t\t<meta property=\"og:title\" content=\"Scraping with Haxe | [mck]\" \/>\n\t\t<meta property=\"og:description\" content=\"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. There is a lot out there, especially\" \/>\n\t\t<meta property=\"og:url\" content=\"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/\" \/>\n\t\t<meta property=\"article:published_time\" content=\"2015-07-24T08:18:22+00:00\" \/>\n\t\t<meta property=\"article:modified_time\" content=\"2015-08-22T19:04:46+00:00\" \/>\n\t\t<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n\t\t<meta name=\"twitter:title\" content=\"Scraping with Haxe | [mck]\" \/>\n\t\t<meta name=\"twitter:description\" content=\"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. 
There is a lot out there, especially\" \/>\n\t\t<script type=\"application\/ld+json\" class=\"aioseo-schema\">\n\t\t\t{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#article\",\"name\":\"Scraping with Haxe | [mck]\",\"headline\":\"Scraping with Haxe\",\"author\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/author\\\/admin\\\/#author\"},\"publisher\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/#organization\"},\"image\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/wp-content\\\/uploads\\\/scraping11.jpg\",\"width\":2000,\"height\":745},\"datePublished\":\"2015-07-24T09:18:22+01:00\",\"dateModified\":\"2015-08-22T20:04:46+01:00\",\"inLanguage\":\"en-US\",\"commentCount\":2,\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#webpage\"},\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#webpage\"},\"articleSection\":\"Haxe, Openfl, Haxe, web data extraction, web harvesting, Web 
scraping\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#breadcrumblist\",\"itemListElement\":[{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog#listItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/category\\\/haxe\\\/#listItem\",\"name\":\"Haxe\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/category\\\/haxe\\\/#listItem\",\"position\":2,\"name\":\"Haxe\",\"item\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/category\\\/haxe\\\/\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#listItem\",\"name\":\"Scraping with Haxe\"},\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog#listItem\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#listItem\",\"position\":3,\"name\":\"Scraping with Haxe\",\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/category\\\/haxe\\\/#listItem\",\"name\":\"Haxe\"}}]},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/#organization\",\"name\":\"[mck]\",\"description\":\"a polymath zapper\",\"url\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/author\\\/admin\\\/#author\",\"url\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/author\\\/admin\\\/\",\"name\":\"Matthijs 
Kamstra\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#authorImage\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/06ff22a1197b6624946e5a3377184f11ddc00ac06a6f1d2311b9d2072bdf61b1?s=96&d=wavatar&r=g\",\"width\":96,\"height\":96,\"caption\":\"Matthijs Kamstra\"}},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#webpage\",\"url\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/\",\"name\":\"Scraping with Haxe | [mck]\",\"description\":\"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\\\/\\\/en.wikipedia.org\\\/wiki\\\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. 
There is a lot out there, especially\",\"inLanguage\":\"en-US\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/#website\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#breadcrumblist\"},\"author\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/author\\\/admin\\\/#author\"},\"creator\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/author\\\/admin\\\/#author\"},\"image\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/wp-content\\\/uploads\\\/scraping11.jpg\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#mainImage\",\"width\":2000,\"height\":745},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/2015\\\/07\\\/24\\\/scraping-with-haxe\\\/#mainImage\"},\"datePublished\":\"2015-07-24T09:18:22+01:00\",\"dateModified\":\"2015-08-22T20:04:46+01:00\"},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/\",\"name\":\"[mck]\",\"description\":\"a polymath zapper\",\"inLanguage\":\"en-US\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.matthijskamstra.nl\\\/blog\\\/#organization\"}}]}\n\t\t<\/script>\n\t\t<!-- All in One SEO -->\n\n","aioseo_head_json":{"title":"Scraping with Haxe | [mck]","description":"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. 
There is a lot out there, especially","canonical_url":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/","robots":"max-image-preview:large","keywords":"","webmasterTools":{"miscellaneous":""},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#article","name":"Scraping with Haxe | [mck]","headline":"Scraping with Haxe","author":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/author\/admin\/#author"},"publisher":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/#organization"},"image":{"@type":"ImageObject","url":"https:\/\/www.matthijskamstra.nl\/blog\/wp-content\/uploads\/scraping11.jpg","width":2000,"height":745},"datePublished":"2015-07-24T09:18:22+01:00","dateModified":"2015-08-22T20:04:46+01:00","inLanguage":"en-US","commentCount":2,"mainEntityOfPage":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#webpage"},"isPartOf":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#webpage"},"articleSection":"Haxe, Openfl, Haxe, web data extraction, web harvesting, Web scraping"},{"@type":"BreadcrumbList","@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#breadcrumblist","itemListElement":[{"@type":"ListItem","@id":"https:\/\/www.matthijskamstra.nl\/blog#listItem","position":1,"name":"Home","item":"https:\/\/www.matthijskamstra.nl\/blog","nextItem":{"@type":"ListItem","@id":"https:\/\/www.matthijskamstra.nl\/blog\/category\/haxe\/#listItem","name":"Haxe"}},{"@type":"ListItem","@id":"https:\/\/www.matthijskamstra.nl\/blog\/category\/haxe\/#listItem","position":2,"name":"Haxe","item":"https:\/\/www.matthijskamstra.nl\/blog\/category\/haxe\/","nextItem":{"@type":"ListItem","@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#listItem","name":"Scraping with 
Haxe"},"previousItem":{"@type":"ListItem","@id":"https:\/\/www.matthijskamstra.nl\/blog#listItem","name":"Home"}},{"@type":"ListItem","@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#listItem","position":3,"name":"Scraping with Haxe","previousItem":{"@type":"ListItem","@id":"https:\/\/www.matthijskamstra.nl\/blog\/category\/haxe\/#listItem","name":"Haxe"}}]},{"@type":"Organization","@id":"https:\/\/www.matthijskamstra.nl\/blog\/#organization","name":"[mck]","description":"a polymath zapper","url":"https:\/\/www.matthijskamstra.nl\/blog\/"},{"@type":"Person","@id":"https:\/\/www.matthijskamstra.nl\/blog\/author\/admin\/#author","url":"https:\/\/www.matthijskamstra.nl\/blog\/author\/admin\/","name":"Matthijs Kamstra","image":{"@type":"ImageObject","@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#authorImage","url":"https:\/\/secure.gravatar.com\/avatar\/06ff22a1197b6624946e5a3377184f11ddc00ac06a6f1d2311b9d2072bdf61b1?s=96&d=wavatar&r=g","width":96,"height":96,"caption":"Matthijs Kamstra"}},{"@type":"WebPage","@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#webpage","url":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/","name":"Scraping with Haxe | [mck]","description":"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. 
There is a lot out there, especially","inLanguage":"en-US","isPartOf":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/#website"},"breadcrumb":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#breadcrumblist"},"author":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/author\/admin\/#author"},"creator":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/author\/admin\/#author"},"image":{"@type":"ImageObject","url":"https:\/\/www.matthijskamstra.nl\/blog\/wp-content\/uploads\/scraping11.jpg","@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#mainImage","width":2000,"height":745},"primaryImageOfPage":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/#mainImage"},"datePublished":"2015-07-24T09:18:22+01:00","dateModified":"2015-08-22T20:04:46+01:00"},{"@type":"WebSite","@id":"https:\/\/www.matthijskamstra.nl\/blog\/#website","url":"https:\/\/www.matthijskamstra.nl\/blog\/","name":"[mck]","description":"a polymath zapper","inLanguage":"en-US","publisher":{"@id":"https:\/\/www.matthijskamstra.nl\/blog\/#organization"}}]},"og:locale":"en_US","og:site_name":"[mck] | a polymath zapper","og:type":"article","og:title":"Scraping with Haxe | [mck]","og:description":"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. 
There is a lot out there, especially","og:url":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/","article:published_time":"2015-07-24T08:18:22+00:00","article:modified_time":"2015-08-22T19:04:46+00:00","twitter:card":"summary_large_image","twitter:title":"Scraping with Haxe | [mck]","twitter:description":"A quick post about something that grabbed my attention quickly. Scraping Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Source: https:\/\/en.wikipedia.org\/wiki\/Web_scraping I already did some research on the subject when I was playing around with my raspberry pi. There is a lot out there, especially"},"aioseo_meta_data":{"post_id":"2211","title":null,"description":null,"keywords":null,"keyphrases":null,"primary_term":null,"canonical_url":null,"og_title":null,"og_description":null,"og_object_type":"default","og_image_type":"default","og_image_url":null,"og_image_width":null,"og_image_height":null,"og_image_custom_url":null,"og_image_custom_fields":null,"og_video":null,"og_custom_url":null,"og_article_section":null,"og_article_tags":null,"twitter_use_og":false,"twitter_card":"default","twitter_image_type":"default","twitter_image_url":null,"twitter_image_custom_url":null,"twitter_image_custom_fields":null,"twitter_title":null,"twitter_description":null,"schema":{"blockGraphs":[],"customGraphs":[],"default":{"data":{"Article":[],"Course":[],"Dataset":[],"FAQPage":[],"Movie":[],"Person":[],"Product":[],"ProductReview":[],"Car":[],"Recipe":[],"Service":[],"SoftwareApplication":[],"WebPage":[]},"graphName":"","isEnabled":true},"graphs":[]},"schema_type":"default","schema_type_options":null,"pillar_content":false,"robots_default":true,"robots_noindex":false,"robots_noarchive":false,"robots_nosnippet":false,"robots_nofollow":false,"robots_noimageindex":false,"robots_noodp":false,"robots_notranslate":false,"robots_max_snippet":null,"robots_max_videopreview":null,"rob
ots_max_imagepreview":"large","priority":null,"frequency":null,"local_seo":null,"breadcrumb_settings":null,"limit_modified_date":false,"ai":null,"created":"2024-12-11 08:53:24","updated":"2025-06-04 10:50:41","seo_analyzer_scan_date":null},"aioseo_breadcrumb":"<div class=\"aioseo-breadcrumbs\"><span class=\"aioseo-breadcrumb\">\n\t\t\t<a href=\"https:\/\/www.matthijskamstra.nl\/blog\" title=\"Home\">Home<\/a>\n\t\t<\/span><span class=\"aioseo-breadcrumb-separator\">&raquo;<\/span><span class=\"aioseo-breadcrumb\">\n\t\t\t<a href=\"https:\/\/www.matthijskamstra.nl\/blog\/category\/haxe\/\" title=\"Haxe\">Haxe<\/a>\n\t\t<\/span><span class=\"aioseo-breadcrumb-separator\">&raquo;<\/span><span class=\"aioseo-breadcrumb\">\n\t\t\tScraping with Haxe\n\t\t<\/span><\/div>","aioseo_breadcrumb_json":[{"label":"Home","link":"https:\/\/www.matthijskamstra.nl\/blog"},{"label":"Haxe","link":"https:\/\/www.matthijskamstra.nl\/blog\/category\/haxe\/"},{"label":"Scraping with Haxe","link":"https:\/\/www.matthijskamstra.nl\/blog\/2015\/07\/24\/scraping-with-haxe\/"}],"_links":{"self":[{"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/posts\/2211","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/comments?post=2211"}],"version-history":[{"count":8,"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/posts\/2211\/revisions"}],"predecessor-version":[{"id":2238,"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/posts\/2211\/revisions\/2238"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/media\/2231"}],"wp:attachment":[{"href":"http
s:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/media?parent=2211"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/categories?post=2211"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.matthijskamstra.nl\/blog\/wp-json\/wp\/v2\/tags?post=2211"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}