Selenium WebDriver Architecture

Wednesday, May 24, 2017

Suppose you are writing your tests in C# using the common Selenium WebDriver API. The C# binding sends commands across that common API, and a driver (whichever one you are using) listens at the other end. The driver interprets those commands and executes them on the actual browser, and the results are returned to your code through the WebDriver API.

In more detail: you write your test in, let's say, C# against the Selenium WebDriver API. The binding code issues commands over the WebDriver wire protocol, which is basically a REST-based web service: each command is an HTTP request sent to a driver server. The driver server is an executable, and each driver has its own, which listens on a port on your local machine. When a command comes in, the driver server interprets it, automates the browser, and returns the result.
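To make the round trip concrete, here is a minimal sketch in Python using only the standard library. It assumes a toy server, `ToyDriverHandler`, which is a hypothetical stand-in for a real driver server: it listens on a local port and echoes the command back instead of automating a browser. `send_command` plays the role of the language binding. The response shape (`status`, `value`) is only loosely modeled on the wire protocol.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ToyDriverHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a driver server; it automates no browser."""

    def do_POST(self):
        # Read the JSON command that the language binding sent.
        length = int(self.headers.get("Content-Length", 0))
        command = json.loads(self.rfile.read(length) or b"{}")
        # A real driver server would drive the browser here.
        # This toy only echoes what it was asked to do.
        result = {"status": 0, "value": {"path": self.path, "received": command}}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def send_command(port, path, payload):
    """Play the role of the language binding: POST one JSON command."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any system proxy for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())


def demo():
    # Port 0 asks the operating system for any free local port.
    server = HTTPServer(("127.0.0.1", 0), ToyDriverHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return send_command(
            server.server_address[1],
            "/session/demo-session/element",
            {"using": "name", "value": "q"},
        )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The session id `demo-session` is made up for the example; a real driver server hands out session ids when a session is created.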

You write a WebDriver command such as:
driver.FindElement(By.Name("q"))

This is converted into an SPI (Stateless Programming Interface) call:
findElement(using="name", value="q")
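The conversion from the object-style API call to the flat SPI form might be sketched like this in Python. The `By` class and the `to_spi` helper are hypothetical miniatures written for illustration; they are not taken from any Selenium binding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class By:
    """Hypothetical miniature of a binding's locator type."""

    using: str
    value: str

    @staticmethod
    def name(value):
        return By("name", value)

    @staticmethod
    def id(value):
        return By("id", value)


def to_spi(command, by):
    """Flatten an object-style call into a (command, parameters) pair."""
    return command, {"using": by.using, "value": by.value}


if __name__ == "__main__":
    print(to_spi("findElement", By.name("q")))
```

Calling `to_spi("findElement", By.name("q"))` returns `("findElement", {"using": "name", "value": "q"})`, which carries the same information as the SPI line above and keeps no state between calls.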

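This call is then sent over the JSON Wire Protocol: an HTTP POST to /session/{sessionId}/element with the JSON body {"using": "name", "value": "q"}.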
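As a rough sketch of that step, the Python snippet below maps an SPI-style command to an HTTP method, path, and JSON body. The `ROUTES` table is deliberately partial, and the command labels `findElement`, `get`, and `getTitle` as well as the helper `to_wire_request` are names assumed for this example. The three paths follow the JSON Wire Protocol.

```python
import json

# Small, partial route table: command label -> (HTTP method, URL template).
ROUTES = {
    "findElement": ("POST", "/session/{session_id}/element"),
    "get": ("POST", "/session/{session_id}/url"),
    "getTitle": ("GET", "/session/{session_id}/title"),
}


def to_wire_request(session_id, command, params=None):
    """Turn a command and its parameters into (method, path, body)."""
    method, template = ROUTES[command]
    path = template.format(session_id=session_id)
    # Only POST commands carry a JSON body in this sketch.
    body = json.dumps(params or {}) if method == "POST" else None
    return method, path, body


if __name__ == "__main__":
    print(to_wire_request("abc123", "findElement", {"using": "name", "value": "q"}))
```

The session id `abc123` is a placeholder; in practice it comes from the response to the request that creates the session.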

The server that receives the request (Selenium Server, or the driver server itself) parses the JSON object in each JSON Wire Protocol command and carries the command out. This part of the code depends on which browser is being automated.
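One way to picture the server side is a handler that breaks down the JSON object and hands the parameters to a browser-specific layer. The Python sketch below assumes two hypothetical backends, `FakeChromeBackend` and `FakeFirefoxBackend`, which return canned element ids instead of touching a browser. The `{"ELEMENT": ...}` reply shape follows the JSON Wire Protocol.

```python
import json


class FakeChromeBackend:
    """Hypothetical browser-specific layer; returns a canned element id."""

    def find_element(self, using, value):
        return {"ELEMENT": f"chrome-{using}-{value}"}


class FakeFirefoxBackend:
    """Second hypothetical backend, to show that only this layer changes."""

    def find_element(self, using, value):
        return {"ELEMENT": f"firefox-{using}-{value}"}


def handle_find_element(raw_body, backend):
    """Parse the JSON command, then delegate to browser-specific code."""
    params = json.loads(raw_body)
    element = backend.find_element(params["using"], params["value"])
    return json.dumps({"status": 0, "value": element})


if __name__ == "__main__":
    request_body = '{"using": "name", "value": "q"}'
    print(handle_find_element(request_body, FakeChromeBackend()))
    print(handle_find_element(request_body, FakeFirefoxBackend()))
```

The parsing code is shared, and only the backend passed in differs per browser.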

Also see: Learning Selenium Testing Tools - Third Edition by Raghavendra Prasad MG