What are GET and POST requests?

!!! A GET request is the 'typical' request a browser sends when it loads a webpage. A POST request is the 'typical' request a browser sends when it submits some data (for example, a form containing an email and password). There is sometimes crossover between the two.

GET requests

The GET method is used to request data from a server / resource. It is the most common HTTP request.

GET Query String

Notably, the key=value pairs (the query string) are sent in the URL of a GET request, for example:

GET /blog?page=2&limit=10

The above would probably be asking the server / application to return page 2, with a limit of 10 records.
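To make the query string concrete, here is a minimal Python sketch that takes the example URL above apart and puts it back together using only the standard library. The variable names are illustrative, and how a real server interprets `page` and `limit` is up to that application.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The request target from the example above.
target = "/blog?page=2&limit=10"

# Split the target into its path and its query string.
parts = urlsplit(target)
path = parts.path
query_string = parts.query

# parse_qs maps each key to a list of values, because a key may repeat.
params = parse_qs(query_string)

# Going the other way: build a query string from key/value pairs.
rebuilt = path + "?" + urlencode({"page": 2, "limit": 10})

print(path)
print(params)
print(rebuilt)
```

A scraper that pages through results would typically change only the `page` value and rebuild the URL for each request.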

As GET requests send variables through the URL, they are generally not used for sensitive data, for example logging in with a username and password. This would be prone to all sorts of security problems: people copy / paste URLs, browsers cache them, they remain in your browser history, and even people looking over your shoulder could read them. URLs also have length restrictions.

POST requests

POST requests are generally used to create or update a resource within an application. However, they can be, and often are, used for 'getting' data, particularly in AJAX-heavy applications - some APIs only respond to POST requests.

This is useful to understand for web scraping, as APIs generally return nicely structured data (e.g. JSON).

A POST request looks something like this:

Host: google.com

POST Query String

Note that in the above POST request example, the query string (name=blah&key=value...) is within the body of the request.

This differs from GET requests in that the variables are 'more hidden' - they are not going to be copy / pasted or bookmarked, for example. POST request bodies are also not subject to the same size restrictions as URLs, so they are used for uploading larger amounts of data, such as images.
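As a rough illustration of the query string travelling in the body rather than the URL, the sketch below builds (but does not send) a POST request with Python's standard library. The endpoint `https://example.com/login` and the field names are hypothetical placeholders, not taken from any real site.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and form fields, for illustration only.
url = "https://example.com/login"
fields = {"name": "blah", "key": "value"}

# Encode the pairs just like a URL query string, then convert to bytes,
# because the request body must be bytes.
body = urlencode(fields).encode("ascii")

# Supplying data makes urllib treat the request as a POST.
request = Request(
    url,
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(request.get_method())  # the HTTP method urllib will use
print(request.full_url)      # the URL carries no query string
print(request.data)          # the key=value pairs live in the body
```

Nothing is sent over the network here. Passing `request` to `urllib.request.urlopen` would actually perform the POST.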

How is this relevant to web scraping?

Your web scraper will need to send GET and POST requests in order to gather the required data correctly. Some endpoints - particularly APIs - will only respond to POST requests.

GET and POST are the most common HTTP request methods, but there are others, as listed below. We generally don't need to worry about these for data collection.

List of HTTP methods:

  • GET
  • POST
  • PUT
  • HEAD
