Summary: In essence, sessions and tokens do the same job. Both are mechanisms for authenticating a user’s identity; they differ only in how the verification is carried out.

This article is shared from the Huawei Cloud Community post “Session/Cookie/Token still confused?”, author: Long Ge Notes.

Plenty of people use JWT tokens in their projects, yet many articles online describe tokens incorrectly. This article therefore compares cookies, sessions, and tokens (throughout the text, “token” means a JWT token). Hopefully everyone will take something away from it.

Cookies

HTTP 0.9 appeared in 1991. Its only purpose was to let people browse web documents, so it had a single request method, GET, and the client simply left after fetching a page. Nothing linked one connection to the next. This is why HTTP is stateless: when it was created, there was no need for state.

However, with the rise of the interactive Web (where users can not only browse but also log in, post comments, shop, and so on), plain browsing was no longer enough. To keep track of a user’s shopping cart, for example, the server needs a way to relate one request to the next so that it knows whose cart a product was added to. Cookies were created for this purpose.

A cookie (plural: cookies) is a small piece of text data that a website stores on the user’s local device in order to identify the user and track the session. The data is sometimes encrypted, and the client keeps it either temporarily or permanently.

The working mechanism is as follows.

Take adding to the shopping cart as an example. On each request, the server stores the product id in the cookie and returns it to the client, which saves the cookie locally. On the next request, the client sends the saved cookie back to the server. The cookie therefore accumulates every product id the user has added, and the cart contents are not lost.
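To make the mechanism above concrete, here is a minimal sketch using Python’s standard http.cookies module. The cookie name cart, the "." separator, and the add_to_cart helper are assumptions made for this illustration; a real shop could encode the cart in any way it likes.

```python
from http.cookies import SimpleCookie


def add_to_cart(cookie_header: str, product_id: str) -> str:
    """Return the cookie value the server sends back after adding one product."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)  # parse the cookie the browser sent
    items = cookie["cart"].value.split(".") if "cart" in cookie else []
    items.append(product_id)
    cookie["cart"] = ".".join(items)  # the whole cart travels inside the cookie
    return cookie["cart"].OutputString()


# Simulate three add-to-cart requests; each response becomes the next request's cookie.
header = ""
for pid in ["1001", "1002", "1003"]:
    header = add_to_cart(header, pid)
    print(len(header), header)
```

Each round trip carries every product id added so far, so the printed length grows with every request.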

The problem is not hard to see: as more products are added to the cart, the cookie sent with every request keeps growing, which is a burden on each request. I only want to add one product to the cart, so why should the whole history be sent back to the server? The cart contents are already recorded on the server, so having the browser resend them is redundant. How can this be improved?

Session

Since the user’s cart is stored on the server anyway, the cookie only needs to carry something that identifies the user, so that the server knows who is adding to the cart. The request body then only needs the id of the product being added this time, which greatly reduces the size of the cookie. This mechanism for telling which user a request comes from is called a session, and the generated string that identifies the user is called the sessionId. It works as follows:

  1. First, the user logs in. The server creates a session for the user and assigns it a unique sessionId. The sessionId is bound to that user, so given a sessionId (say, abc) the server can look up which user it belongs to. The server then passes the sessionId to the browser in a cookie.
  2. After that, the browser only needs to include the key-value pair sessionId=abc in the cookie of each add-to-cart request. The server finds the user from the sessionId and saves the submitted product id in that user’s shopping cart on the server.

This way, the cookie no longer has to carry every product id in the shopping cart, which greatly reduces the burden on each request!
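The two steps above can be sketched in a few lines of Python. This is only an in-memory illustration on a single machine: the sessions and carts dictionaries and the login, find_user, and add_to_cart helpers are hypothetical names chosen for this example.

```python
import secrets

sessions = {}  # sessionId -> user id, kept on the server
carts = {}     # user id -> list of product ids, kept on the server


def login(user_id: str) -> str:
    """Create a session and return the sessionId that goes into the cookie."""
    session_id = secrets.token_urlsafe(16)  # unguessable random identifier
    sessions[session_id] = user_id
    return session_id


def find_user(session_id: str):
    """Look up which user a sessionId belongs to, or None if it is unknown."""
    return sessions.get(session_id)


def add_to_cart(session_id: str, product_id: str) -> list:
    user_id = find_user(session_id)
    if user_id is None:
        raise PermissionError("unknown session, please log in again")
    carts.setdefault(user_id, []).append(product_id)
    return carts[user_id]


sid = login("alice")      # the browser receives Set-Cookie: sessionId=<sid>
add_to_cart(sid, "1001")  # later requests send only the sessionId and one product id
print(add_to_cart(sid, "1002"))
```

The request now carries a short fixed-size identifier, while the cart itself stays on the server.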

Note also that the cookie is stored on the client while the session is stored on the server, and that the sessionId has to be carried by a cookie to be of any use.

Pain points of session

Cookie plus session seems to solve the problem, but one thing has been overlooked. The scheme above works because we assumed a single server. In production, to ensure high availability, a service usually runs on at least two machines, with a load balancer deciding which machine each request goes to.

After the client sends a request, the load balancer (such as Nginx) decides which machine handles it.

Suppose the login request lands on machine A. Machine A creates the session and returns the sessionId to the browser in a cookie. Now the problem appears: if the next add-to-cart request lands on machine B or C, those machines cannot find the session, because it was created on A. The add-to-cart request fails and the user has to log in again. What can be done? There are three main approaches.

1. Session copy

Machine A creates the session and copies it to B and C, so every machine holds it. Whichever machine an add-to-cart request lands on, the session can be found and the request succeeds.

Although this method is feasible, the disadvantages are also obvious:

  1. The same session is stored in multiple copies, which is redundant data.
  2. This is fine with a few nodes, but services such as Ali and WeChat, with hundreds of millions of daily active users, may need tens of thousands of machines. The cost of replicating sessions grows with the number of nodes and becomes very large.

2. Session sticking

This method sends each client’s requests to a fixed machine. For example, if the browser’s login request goes to machine A, all later add-to-cart requests also go to machine A. Nginx supports this, with stickiness based on the client IP, a cookie, and so on. Sticking by IP is configured as follows:

upstream tomcats {
  # ip_hash routes requests from the same client IP to the same backend server
  ip_hash;
  server 10.1.1.107:88;
  server 10.1.1.132:80;
}

With this configuration, as long as a client’s IP does not change, the hash computed from it sends every request from that client to the same machine, so the session is always found. The drawback is just as obvious: what happens if that machine goes down?
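The idea of sticking by IP, and its weak point, can be illustrated with a small Python sketch. The pick_server helper and the MD5-based hash are assumptions for this example only; Nginx’s real ip_hash uses its own hashing key and algorithm.

```python
import hashlib


def pick_server(client_ip: str, servers: list) -> str:
    """Map a client IP to one backend; the same IP always gets the same backend."""
    digest = hashlib.md5(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


servers = ["10.1.1.107:88", "10.1.1.132:80"]
client = "203.0.113.7"

first = pick_server(client, servers)
print("login handled by", first)
print("next request handled by", pick_server(client, servers))  # same machine

# If that machine goes down, the client is routed elsewhere,
# to a machine that has never seen its session.
survivors = [s for s in servers if s != first]
print("after failure handled by", pick_server(client, survivors))
```

As long as the server list is unchanged the mapping is stable, but once the chosen machine is removed the client lands on a machine without its session.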

3. Session sharing

This is the approach most commonly used by large companies today. Sessions are stored in middleware such as redis or memcached, and every machine fetches the session from that middleware when a request arrives.

The drawback is easy to spot: every request has to fetch the session from redis, which adds a little overhead. In addition, redis itself must be clustered to be highly available. Large companies generally run redis clusters already, so for them this solution is the first choice.
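A minimal sketch of session sharing follows. To keep it runnable without extra software, the SharedSessionStore class is an in-memory stand-in for redis or memcached, and AppServer, login, and whoami are hypothetical names; a real deployment would talk to the middleware over the network.

```python
import secrets


class SharedSessionStore:
    """In-memory stand-in for an external store such as redis (assumption)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class AppServer:
    """One application machine; it keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user_id):
        session_id = secrets.token_urlsafe(16)
        self.store.set(session_id, user_id)  # written to the shared store
        return session_id

    def whoami(self, session_id):
        return self.store.get(session_id)    # one store lookup per request


store = SharedSessionStore()
machine_a = AppServer("A", store)
machine_b = AppServer("B", store)

sid = machine_a.login("alice")  # login lands on machine A
print(machine_b.whoami(sid))    # the next request lands on B and still works
```

Because both machines read from the same store, it no longer matters where the load balancer sends a request.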

Token: no session!

The analysis above shows that sharing sessions on the server side solves user identification, but it has a small flaw: do I really have to set up a redis cluster just for an authentication mechanism? Redis is widely used at large companies, but a small company’s traffic may not justify running it. Is there an authentication mechanism that does not store sessions on the server at all? That is the main character of this article: the token.

First, the user submits a user name and password. After checking them, the server generates a token. The client stores the token locally and attaches it to a request header on every later request to the server.

Two questions come up at this point.

1. The token is stored only in the browser, not on the server. Could I not just make up a token myself and send it to the server?

Answer: The server has a verification mechanism that checks whether the token is legitimate.

2. With a session, the server looks up the user id from the sessionId. Without that lookup, how does the server know which user the token belongs to?

Answer: The token itself carries the user id.

On the first question, how is the token verified? We can borrow the signature mechanism used in HTTPS. First, look at the components of a JWT token.

The token consists of three parts:

  1. Header: specifies the signature algorithm.
  2. Payload: holds non-sensitive data such as the user id and the expiration time.
  3. Signature: the server reads the signature algorithm from the header, then uses its secret key to sign the header + payload with that algorithm. The header, payload, and signature together form the token.

When the server receives a token from the browser, it takes the header + payload out of the token, computes a signature over them with its key, and compares the result with the signature in the token. If the two match, the signature is valid, and so is the token. And because the user id is stored in the payload, the server can read it straight from the token, avoiding the redis lookup that a session requires.

Voice-over: The header and payload are actually Base64URL-encoded. For ease of description, this step is omitted in this article.

This approach is elegant: as long as the server keeps the key secret, the tokens it issues can be trusted, because a forged token cannot pass signature verification and is rejected as invalid.
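The signing and verification steps described above can be sketched with Python’s standard library alone. This is a simplified illustration of an HS256-style token, not a complete JWT implementation: SECRET is a placeholder key, sign and verify are hypothetical helper names, and the algorithm is fixed on the server side rather than read from the header. Production code would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret-key"  # assumption: placeholder, a real key must stay private


def b64url(data: bytes) -> str:
    """Base64URL-encode without padding, as JWT does."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict) -> str:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
    )
    signature = hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify(token: str):
    """Return the payload if the signature matches and the token is unexpired, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = (header_b64 + "." + payload_b64).encode()
        expected = hmac.new(SECRET, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
            return None  # forged or tampered token
        payload = json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None      # malformed token
    if not isinstance(payload, dict) or payload.get("exp", 0) < time.time():
        return None      # missing or past expiration time
    return payload


token = sign({"uid": 42, "exp": int(time.time()) + 3600})
print(token)
print(verify(token))  # the user id comes straight out of the payload
```

Changing even one character of the payload changes the expected signature, so a client that edits the uid without knowing the key gets None back from verify.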

This removes the need to store session state on the server; each client holds its own token instead. Note, however, that once the server has issued a token, it stays valid until it expires and cannot be revoked. The only way around this is to keep a blacklist of tokens on the server and check it before verifying each token, treating any token on the list as invalid. But then the blacklist has to be stored on the server, which brings us back to the session model, so why not just use sessions? The usual practice, therefore, is for the client to delete its local copy of the token on logout and to obtain a new token at the next login.
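For illustration, a server-side blacklist could look like the sketch below. It assumes each token carries a unique id (JWT defines a jti claim for this) and an expiration time; the revoked dictionary and the revoke and is_revoked helpers are hypothetical names.

```python
import time

revoked = {}  # token id (jti) -> expiration timestamp; this state lives on the server


def revoke(jti: str, exp: float) -> None:
    """Blacklist a token until the moment it would have expired anyway."""
    revoked[jti] = exp


def is_revoked(jti: str, now: float = None) -> bool:
    """Check the blacklist; called before the signature is verified."""
    now = time.time() if now is None else now
    # Entries for tokens that have already expired can be dropped.
    for key in [k for k, exp in revoked.items() if exp <= now]:
        del revoked[key]
    return jti in revoked


revoke("token-123", time.time() + 3600)
print(is_revoked("token-123"))  # True: rejected even though the signature is valid
print(is_revoked("token-456"))  # False: continue with signature verification
```

The revoked dictionary is exactly the kind of server-side state that tokens were meant to avoid, and with several machines it would have to be shared just like a session store.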

Note also that the token is generally sent in the Authorization request header rather than in a cookie. This is mainly to get around the fact that cookies cannot be shared across domains (detailed below).

A brief summary of Cookies and Tokens

What are the limitations of cookies?

1. Cookies cannot be shared across sites, so implementing single sign-on (SSO) across multiple applications with cookies is difficult. It requires more complicated tricks; if you are interested, see the reference link at the end of the article.

Voice-over: Single sign-on means that, across multiple application systems, a user logs in once and can then access every mutually trusting application.

With tokens, SSO becomes very simple.

Just put the token in the Authorization header (or another custom header) of each request, and every site, across domains, can authenticate the user, provided each site is able to verify the token’s signature.
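As a small illustration, the sketch below attaches the same token to requests for two different domains using Python’s standard urllib. The URLs, the token string, and the build_request helper are placeholders, and the "Bearer" prefix is the commonly used scheme for tokens in this header; the requests are only built here, not sent.

```python
import urllib.request


def build_request(url: str, token: str) -> urllib.request.Request:
    """Create a request that carries the token in the Authorization header."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Bearer " + token)
    return request


token = "header.payload.signature"  # placeholder for a real JWT
for url in ["https://shop.example.com/api/cart", "https://pay.example.com/api/orders"]:
    request = build_request(url, token)
    print(request.full_url, request.get_header("Authorization"))
```

Unlike a cookie, the header is set explicitly by the client code, so it does not matter which domain the request goes to.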

2. Native requests on mobile clients have no built-in cookie handling, and the sessionId depends on cookies, so it cannot be passed that way. A token does not have this problem, because it travels in the Authorization header. In other words, tokens support mobile platforms naturally and scale well.

To sum up, tokens are simple to store and scale well.

What are the disadvantages of tokens?

If tokens are so good, why do almost all large companies share sessions instead? Many people may even be hearing about tokens for the first time. Tokens have the following two disadvantages:

1. The token is too long

A token is the encoded form of the header and payload plus a signature, so it is generally much longer than a sessionId and may well exceed the size limit of a cookie (typically about 4 KB). The more information you store in the token, the longer it gets, and since the token is sent with every request, it becomes a burden on each request.
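A rough size comparison can be computed directly. The sketch below assumes an HS256-style token (32-byte signature), a 16-byte random sessionId, and made-up payload fields; jwt_length is a hypothetical helper that only measures length and does not produce a valid signature.

```python
import base64
import json
import secrets


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def jwt_length(payload: dict, signature_bytes: int = 32) -> int:
    """Length in characters of header.payload.signature for the given payload."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode()),
        b64url(json.dumps(payload, separators=(",", ":")).encode()),
        b64url(bytes(signature_bytes)),  # placeholder bytes, only the length matters
    ]
    return len(".".join(parts))


session_id = secrets.token_urlsafe(16)
small = jwt_length({"uid": 42, "exp": 1700000000})
large = jwt_length({"uid": 42, "exp": 1700000000, "roles": ["admin"] * 50})
print(len(session_id), small, large)
```

Base64 encoding alone inflates the payload by about a third, and every extra field in the payload is paid for again on each request.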

2. Not very safe

Many articles online claim that tokens are more secure, but that is not really the case. We said that the token is stored in the browser, but where exactly? Since it may be too long to fit in a cookie, it has to go into local storage, and that is a security risk, because local storage can be read directly by JavaScript. As mentioned above, a token also cannot be revoked once issued; it stays valid until it expires. So if the server detects a security threat, it has no way to invalidate the affected tokens.

Tokens are therefore better suited to one-time command authentication, with a relatively short validity period.

Misunderstanding: cookies are less secure than tokens, for example because of CSRF attacks

First of all, what is a CSRF attack?

The attacker tricks the user’s browser into visiting a website the user is already logged in to and performing some operation there (sending emails or messages, or even financial operations such as transferring money or buying goods). Because the browser is already authenticated (the cookie carries the sessionId and other identity information), the website treats the request as a genuine user action and executes it.

For example, suppose a user logs in to a bank website (say http://www.examplebank.com/, where the transfer address is http://www.examplebank.com/withdraw?account=AccountName&amount=1000&for=PayeeName), so the cookie contains the logged-in user’s sessionId. An attacker can place the following code on another website:

<img src="http://www.examplebank.com/withdraw?account=Alice&amount=1000&for=Badman">

If a logged-in user then opens the page containing this tag, the browser requests the image URL automatically, with no further click needed. Requests to a domain automatically carry that domain’s cookies, and the cookie contains the logged-in user’s sessionId, so the server accepts the transfer above. This is a serious security risk.

The root cause of CSRF attacks is that the browser automatically attaches a domain’s cookies to every request sent to that domain. This is built into how browsers work, which is why many people believe cookies are unsafe.

Using a token does avoid CSRF, but as mentioned above, a token kept in local storage can be read by JavaScript, so it is not safe from a storage point of view. (In fact, the proper defense against CSRF attacks is to use a CSRF token.)
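A CSRF token can be sketched as follows. The idea is that the server puts an unguessable value into its own pages and requires it back on every state-changing request; a forged request from another site cannot know that value. The session dictionary and the issue_csrf_token and is_request_allowed helpers are hypothetical names for this example.

```python
import hmac
import secrets

session = {"user": "alice"}  # server-side session data of one logged-in user


def issue_csrf_token(session: dict) -> str:
    """Generate a random value and remember it in the user's session."""
    session["csrf"] = secrets.token_urlsafe(32)
    return session["csrf"]  # embedded in the site's own form, not sent as a cookie


def is_request_allowed(session: dict, submitted_token) -> bool:
    """Accept a state-changing request only if it echoes the expected token."""
    expected = session.get("csrf")
    if not expected or not submitted_token:
        return False
    return hmac.compare_digest(expected.encode(), submitted_token.encode())


form_token = issue_csrf_token(session)
print(is_request_allowed(session, form_token))  # True: request came from the real form
print(is_request_allowed(session, None))        # False: forged request without the token
```

The cookie is still attached automatically to the forged request, but without the matching token the server refuses to act on it.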

So neither a cookie nor a token is safe from a storage point of view; both can be exposed. What we usually mean by security is security in transit, which HTTPS provides: the request headers are encrypted, so the credentials are protected while being transmitted.

In fact, comparing cookies with tokens does not really make sense: one is a storage mechanism, the other an authentication mechanism. The right comparison is session vs. token.

Summary

In essence, sessions and tokens do the same job. Both authenticate a user’s identity; they differ only in how verification is done (a session is stored on the server and verified by looking it up in middleware such as redis, while a token is stored on the client and verified by its signature). Sessions are the more reasonable choice in most scenarios, while tokens are a better fit for single sign-on and one-time command authentication. Choosing the right model for each business scenario achieves twice the result with half the effort.
