Technology secrets behind Alibaba 11.11

This post is inspired by and based on the Alibaba Cloud WeChat article https://mp.weixin.qq.com/s/X1MtLk71LLZnsYZBVv-hHA

During this year's 11.11 (2019), Alibaba's Tmall Double 11 sales event reached a turnover of 268.4 billion Chinese yuan. Orders peaked at 544,000 per second, and daily data processing reached 970 PB. The whole system runs on Alibaba Cloud.

According to Alibaba's CTO in that article, four technologies were behind this:

  • The 3rd generation of the X-Dragon architecture, a technology similar to AWS Nitro.
  • OceanBase and PolarDB, Alibaba's self-developed databases.
  • Separation of compute and storage. The storage is remote and can be expanded easily.
  • RDMA (Remote Direct Memory Access), used to access the remote storage quickly.

To support request volumes this large, improvements are needed on both the physical machine side and the database side. Reading and retrieving data quickly is the key.

AWS Cognito + MP JWT RBAC + Quarkus

In this post, we will try to build Role-Based Access Control (RBAC) with Quarkus, MicroProfile JWT RBAC and AWS Cognito.

AWS Cognito issues the JWT tokens and publishes the RSA public keys. Quarkus is responsible for the Java server-side API endpoints.

Useful links:

Eclipse MicroProfile – JWT RBAC Security (MP-JWT)

QUARKUS – USING JWT RBAC

  • Create an AWS Cognito User Pool, then create a user and a group in it. Here we use Cognito groups as user roles.
  • Note the User Pool ID, e.g. "eu-central-1_xxxxx". The User Pool is the JWT issuer, and its RSA public keys are published at "https://cognito-idp.eu-central-1.amazonaws.com/eu-central-1_xxxxx/.well-known/jwks.json".
  • Create the endpoint with Quarkus, for example:
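// javax.* namespace as used by Quarkus at the time of writing; newer releases use jakarta.*
import java.util.Arrays;
import javax.annotation.security.RolesAllowed;
import javax.enterprise.context.RequestScoped;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/orders")
@RequestScoped
public class OrderResource {

    // Only callers whose token carries the USER or ADMIN group may list orders.
    @GET
    @RolesAllowed({"USER", "ADMIN"})
    @Produces(MediaType.APPLICATION_JSON)
    public Response list() {
        return Response.ok(Arrays.asList("Order1", "Order2")).build();
    }
}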

Most importantly, the default groups claim in MP-JWT is “groups”, but Cognito puts the groups in the “cognito:groups” claim, so we need to configure a mapping:

smallrye.jwt.path.groups=cognito:groups
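
To see why this mapping is needed, it can help to look at the payload of a token. The following is a minimal sketch that decodes the payload segment of a JWT with the Java standard library only. It does not verify the signature, and the sample token, its claim values and the class name are made up for illustration; a real Cognito token carries more claims.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class CognitoGroupsClaimSketch {

    // Decodes the payload (second segment) of a JWT WITHOUT verifying the signature.
    static String decodePayload(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("not a JWT");
        }
        byte[] json = Base64.getUrlDecoder().decode(parts[1]);
        return new String(json, StandardCharsets.UTF_8);
    }

    // Hypothetical, unsigned token that mimics the shape of a Cognito payload.
    static String sampleToken() {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        String header = "{\"alg\":\"RS256\",\"typ\":\"JWT\"}";
        String payload = "{\"sub\":\"user-1\",\"cognito:groups\":[\"USER\"]}";
        return enc.encodeToString(header.getBytes(StandardCharsets.UTF_8))
                + "." + enc.encodeToString(payload.getBytes(StandardCharsets.UTF_8))
                + ".signature-placeholder";
    }

    public static void main(String[] args) {
        // The groups live under "cognito:groups", not under "groups".
        System.out.println(decodePayload(sampleToken()));
    }
}
```

The roles used in @RolesAllowed are looked up in the groups claim, which is why the claim path has to point at “cognito:groups”.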

Other necessary configs:

mp.jwt.verify.publickey.location=https://cognito-idp.eu-central-1.amazonaws.com/eu-central-1_xxxxx/.well-known/jwks.json

mp.jwt.verify.issuer=https://cognito-idp.eu-central-1.amazonaws.com/eu-central-1_xxxxx

quarkus.smallrye-jwt.enabled=true
quarkus.smallrye-jwt.auth-mechanism=MP-JWT
quarkus.smallrye-jwt.realm-name=Quarkus-JWT

For testing, you can get a Cognito JWT token with the AWS CLI:

aws cognito-idp admin-initiate-auth --region eu-central-1 --cli-input-json file://auth.json

Then put the token in the HTTP “Authorization” header, prefixed with “Bearer ”, for example:

curl -X GET \
https://example/orders \
-H 'Authorization: Bearer YOUR_JWT_TOKEN'

There you have it: Quarkus + MP JWT integrated with AWS Cognito. Enjoy!

Tips: H1B Visa Stamping in Paris France (Conseils pour préparer votre H1B visa à Paris France)

  • Fill in the famous DS-160 form online.
  • Book the appointment online as early as possible.
  • Arrive early on the visa stamping day because there will be a very long queue outside the US embassy.
  • Don’t bring laptops or iPads: they are not allowed inside, and you would have to find a nearby hotel to store them temporarily.
  • Bring every document you have, even those not on the list, for example your CV and offer letter.
  • Speak clearly and in detail about your experience when they ask.
  • At the end, ask whether your visa is approved or “checked” (put under further review)!
  • If your visa is unfortunately “checked”, be patient and email the US embassy in Paris regularly for updates.

Lisbon Travel Tips (里斯本 旅游攻略)

Lisbon is a beautiful city and well suited to a 3-4 day trip. After spending a few days there, I learned some lessons that I want to write down to help others have a better stay.

  • Lisbon Airport is a mess; plan to arrive 2 hours before your flight.
  • Uber and Bolt work in Lisbon and are better than taxis. If you do take a taxi, ask the driver to use the taximeter, and have cash ready.
  • At the airport, you can buy the Lisbon Card. With it, you get free public transport, such as the tramway, and access to many museums and historical sites.
  • Always buy tickets for top attractions (the Jerónimos Monastery, for example) online to avoid long queues.
  • Most of the museums are closed on Monday!!!
  • At the famous Pastéis de Belém near the Jerónimos Monastery, you will see a very long queue, but it is only for takeaway. You can walk straight into the shop and eat at a table, which is easier and more comfortable.
  • If you want to hear Fado music, you can go to the “O Faia” restaurant. The music and food are excellent, but there is a minimum spend of 50 euros per person and the show only begins at 21:30.
  • Be careful when eating in restaurants in the Alfama area: they may charge service fees without telling you beforehand.
  • You can download the Lonely Planet app to help organize your visits.
  • Finally, Lisbon is a tourist city with a great many tourists and also thieves. And don’t forget to bring comfortable sports shoes!

Probabilistic Data Structures

Context: when dealing with a very large data set or a data stream, we cannot keep everything in memory.

1. Bloom filter: used to query whether an item exists (membership query). A Bloom filter is a bit array of m bits initialized to 0. To add an element, feed it to k hash functions to get k array positions and set the bits at these positions to 1. To query an element, feed it to the same k hash functions to obtain k array positions. If any of the bits at these positions is 0, the element is definitely not in the set. If the bits are all 1, the element might be in the set.
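
The add and query steps above can be sketched in a few lines of Java. This is a minimal illustration, not a tuned implementation: the class name is made up, and the k positions are derived from two base hashes (double hashing), which is one common way to simulate k hash functions and an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

class BloomFilterSketch {
    private final BitSet bits;
    private final int m; // number of bits in the array
    private final int k; // number of hash positions per item

    BloomFilterSketch(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    // i-th position = (h1 + i * h2) mod m, simulating k hash functions.
    private int position(String item, int i) {
        byte[] data = item.getBytes(StandardCharsets.UTF_8);
        int h1 = Arrays.hashCode(data);
        int h2 = fnv1a(data) | 1; // forced odd so it is never zero
        return Math.floorMod(h1 + i * h2, m);
    }

    // 32-bit FNV-1a hash, used as the second base hash.
    private static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }

    void add(String item) {
        for (int i = 0; i < k; i++) {
            bits.set(position(item, i));
        }
    }

    boolean mightContain(String item) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(position(item, i))) {
                return false; // a 0 bit means the item was definitely never added
            }
        }
        return true; // all bits are 1: the item might be in the set
    }

    public static void main(String[] args) {
        BloomFilterSketch filter = new BloomFilterSketch(1024, 3);
        filter.add("order-1");
        System.out.println(filter.mightContain("order-1"));
    }
}
```

An added item is always reported as possibly present; false positives are possible, false negatives are not.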

2. HyperLogLog: used for estimating the number of distinct elements (cardinality). A HyperLogLog counter can count one billion distinct items with an accuracy of 2% using only 1.5 KB of memory. It is based on an observation about bit patterns: for a stream of uniformly distributed random numbers, if the maximum number of leading 0 bits seen in any number is k, the cardinality of the stream is likely to be about 2^k.
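
A simplified sketch of the idea follows. It splits the hashed values over m = 2^p registers, keeps the maximum "leading zeros + 1" per register, and combines the registers with a harmonic mean, falling back to linear counting for small cardinalities. The class name, the SplitMix64-style mixing function and the choice of p are assumptions of this sketch, and it omits the large-range correction of the full algorithm.

```java
class HyperLogLogSketch {
    private final int p;            // number of bits used to pick a register
    private final int m;            // number of registers, m = 2^p
    private final byte[] registers; // max rank seen per register

    HyperLogLogSketch(int p) {
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    // SplitMix64-style mixer so that even sequential inputs look random.
    private static long mix(long z) {
        z += 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    void add(long item) {
        long h = mix(item);
        int index = (int) (h >>> (64 - p)); // first p bits choose the register
        long rest = h << p;                 // remaining bits carry the pattern
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - p + 1);
        if (rank > registers[index]) {
            registers[index] = (byte) rank;
        }
    }

    double estimate() {
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1 + 1.079 / m); // bias constant, valid for m >= 128
        double raw = alpha * m * m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            return m * Math.log((double) m / zeros); // linear counting for small sets
        }
        return raw;
    }

    public static void main(String[] args) {
        HyperLogLogSketch hll = new HyperLogLogSketch(12); // 4096 registers
        for (long i = 0; i < 100_000; i++) {
            hll.add(i);
        }
        System.out.println(hll.estimate());
    }
}
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%, and adding an item that was already seen never changes the estimate.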

3. Count-Min Sketch: used for querying the count of a single item (frequency). The basic data structure is a two-dimensional d × w array of counters with d pairwise independent hash functions h1 … hd, each with range w.
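
The d × w structure can be sketched as follows. Each row uses a hash of the form ((a * x + b) mod P) mod w with random a and b, which is a standard pairwise independent family; the class name, the prime P and the use of String.hashCode() as the input x are assumptions of this sketch.

```java
import java.util.Random;

class CountMinSketch {
    private static final long PRIME = 2147483647L; // 2^31 - 1

    private final int d;             // number of rows (hash functions)
    private final int w;             // number of counters per row
    private final long[][] counters;
    private final long[] a;          // per-row multiplier, 1 .. PRIME-1
    private final long[] b;          // per-row offset, 0 .. PRIME-1

    CountMinSketch(int d, int w, long seed) {
        this.d = d;
        this.w = w;
        this.counters = new long[d][w];
        this.a = new long[d];
        this.b = new long[d];
        Random random = new Random(seed);
        for (int i = 0; i < d; i++) {
            a[i] = 1 + random.nextInt((int) PRIME - 1);
            b[i] = random.nextInt((int) PRIME);
        }
    }

    // Column of the item in the given row: ((a * x + b) mod PRIME) mod w.
    private int column(int row, String item) {
        long x = Math.floorMod((long) item.hashCode(), PRIME);
        return (int) (((a[row] * x + b[row]) % PRIME) % w);
    }

    void add(String item) {
        for (int row = 0; row < d; row++) {
            counters[row][column(row, item)]++;
        }
    }

    // Collisions can only inflate counters, so the minimum over the rows
    // is the tightest available upper bound on the true count.
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < d; row++) {
            min = Math.min(min, counters[row][column(row, item)]);
        }
        return min;
    }

    public static void main(String[] args) {
        CountMinSketch sketch = new CountMinSketch(4, 1024, 42L);
        sketch.add("apple");
        sketch.add("apple");
        sketch.add("pear");
        System.out.println(sketch.estimate("apple"));
    }
}
```

The estimate is never below the true count; it can be above it when items collide in every row.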

Link from https://dzone.com/articles/introduction-probabilistic-0

REST API Part2

  • Stateless: a RESTful API is stateless. A stateless protocol does not require the server to retain session information or status about each communicating partner across multiple requests.

UDP, HTTP, and IP are stateless protocols. TCP is stateful.

  • HEAD, GET, OPTIONS and TRACE are safe methods, which means they are intended only for information retrieval and should not change the state of the server. In other words, they should not have side effects.

 

  • Methods PUT and DELETE are defined to be idempotent, meaning that multiple identical requests should have the same effect as a single request. Methods GET, HEAD, OPTIONS, and TRACE, being prescribed as safe, should also be idempotent.
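
Idempotency is easiest to see on a small in-memory store. The sketch below is a hypothetical model, not a real HTTP server: the class and method names are made up, with put and delete standing in for PUT and DELETE, and post standing in for POST, which is not idempotent and is included only for contrast.

```java
import java.util.HashMap;
import java.util.Map;

class OrderStoreSketch {
    private final Map<String, String> orders = new HashMap<>();
    private int nextId = 1;

    // PUT /orders/{id}: repeating the same request leaves the same state.
    void put(String id, String body) {
        orders.put(id, body);
    }

    // DELETE /orders/{id}: deleting twice has the same effect as deleting once.
    void delete(String id) {
        orders.remove(id);
    }

    // POST /orders: every call creates a new resource, so it is not idempotent.
    String post(String body) {
        String id = "order-" + nextId++;
        orders.put(id, body);
        return id;
    }

    // GET /orders/{id}: safe, only reads state.
    String get(String id) {
        return orders.get(id);
    }

    int size() {
        return orders.size();
    }

    public static void main(String[] args) {
        OrderStoreSketch store = new OrderStoreSketch();
        store.put("a", "book");
        store.put("a", "book"); // same state as after the first call
        store.post("pen");
        store.post("pen");      // creates a second, distinct order
        System.out.println(store.size());
    }
}
```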

 

Some thoughts on Startup

I have been working at a Fintech startup in Paris, France for a year and a half now. Along the way I have collected some personal thoughts, which may differ from the usual view but are realistic.

  • B2C is hard. Nowadays, especially here in France, many well-funded startups are B2B. Consumers always prefer free and easy-to-use products. Many B2C products earn their revenue from advertising (Facebook) or from their content (Spotify).

  • Scale and growth are the key. Investors give you money only because your product can achieve extreme growth, which means it can easily be rolled out in other countries or markets.

  • The business model is vital. Business model -> revenue or more funding -> growth -> good salaries (perks) -> top talent -> top team -> top product -> revenue or more funding. The business model is the start of this cycle. Some startups have a great user base but no business model and eventually get acquired by giants (like Zenly).

  • Business drives IT. IT supports business operations and growth, and business growth brings the technical challenges (as at Alibaba). Without a business scenario, all the cutting-edge tech is pointless.

  • Top talent makes the difference. People always matter; people make it happen. Top talent drives the crazy growth of a startup.

AWS S3 SDK Java Feedback

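  • Verify whether a bucket already exists

Call the doesBucketExist method (doesBucketExistV2 in more recent 1.x releases of the SDK, where doesBucketExist is deprecated).

  • Create a subdirectory (folder): use a key that ends with “/”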


PutObjectRequest request = new PutObjectRequest(bucketName, "folder/", new File(fileName));

  • Remove all the files under a folder:

    use the folder key as the prefix when listing the objects to delete, e.g. folder/

  • Use a presigned URL to share an object (public URL)

We often need to upload a file (an image, for example) to S3 and get a temporary public URL for it. (Be careful: the object itself should always stay private.)

We can use a presigned URL with an expiration period.

Code Example

Integration Okta with your web application

1. Introduction to Okta

Okta is an identity management SaaS product. It provides a single sign-on solution, serves as a security gateway, and can protect all the applications that employees use daily.

It also provides the Okta API, so you can integrate their solution into your application. For example, you can use the Okta authentication and authorization API to control login.

2. Create an Okta Web Application

Start by signing up at Okta Developer, then create a Web Application:

Specify Login redirect URIs

3. Integration Front End : okta_sign-in_widget

You can use the Okta Sign-In Widget as your login page.

var signIn = new OktaSignIn({baseUrl: 'https://dev-xxxx.oktapreview.com'});
  signIn.renderEl({
    el: '#widget-container'
  }, function success(res) {
    if (res.status === 'SUCCESS') {
      console.log('Do something with this sessionToken', res.session.token);
    } else {
    // The user can be in another authentication state that requires further action.
    // For more information about these states, see:
    //   https://github.com/okta/okta-signin-widget#rendereloptions-success-error
    }
  });

4. Integration Back End: Implement the Authorization Code Flow

At a high level, this flow has the following steps (copied from the Okta documentation):

  • Your application directs the browser to the Okta Sign-In page, where the user authenticates.
  • The browser receives an authorization code from your Okta authorization server.
  • The authorization code is passed to your application.
  • Your application sends this code to Okta, and Okta returns access and ID tokens, and optionally a refresh token.
  • Your application can now use these tokens to call the resource server (for example an API) on behalf of the user.

As shown in the login widget JavaScript above, you get a res.session.token; send this session token to your backend controller.

The backend controller uses this session token to call the /authorize endpoint and get a code (you may get a 302 response whose Location header contains a URI with a code parameter).

After that, use the code to call the /token endpoint to get the access and ID tokens.
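
The first backend step can be sketched with the Java standard library alone: build the /authorize URL that carries the session token, then pull the code out of the Location header of the 302 response. This is only a sketch under assumptions: the class and method names are made up, the path assumes the "default" authorization server (/oauth2/default/v1/authorize), the scope is fixed to openid, and depending on your Okta setup further parameters such as nonce may be needed. It requires Java 10 or later for the Charset-based encode and decode calls.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class OktaAuthCodeFlowSketch {

    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Builds the /authorize request URL for the authorization code flow.
    // baseUrl is your Okta domain, e.g. https://dev-xxxx.oktapreview.com
    static String buildAuthorizeUrl(String baseUrl, String clientId, String redirectUri,
                                    String state, String sessionToken) {
        return baseUrl + "/oauth2/default/v1/authorize" // assumed authorization server path
                + "?client_id=" + enc(clientId)
                + "&response_type=code"
                + "&scope=" + enc("openid")
                + "&redirect_uri=" + enc(redirectUri)
                + "&state=" + enc(state)
                + "&sessionToken=" + enc(sessionToken);
    }

    // Extracts the "code" query parameter from the Location header of the 302 response.
    // Returns null when the URI carries no code.
    static String extractCode(String location) {
        String query = URI.create(location).getRawQuery();
        if (query == null) {
            return null;
        }
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2 && kv[0].equals("code")) {
                return URLDecoder.decode(kv[1], StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(buildAuthorizeUrl("https://dev-xxxx.oktapreview.com",
                "my-client-id", "https://app.example.com/callback", "state-1", "session-token"));
        System.out.println(extractCode("https://app.example.com/callback?code=abc123&state=state-1"));
    }
}
```

The extracted code is then posted to the /token endpoint together with the client credentials and the same redirect URI; the state value should be checked against the one that was sent.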

Akka Actor – Scala

1. Akka Actor:

Actor Model: one of the technologies used to deal with concurrent computation. (The other two are Reactor and Event Driven.)

2. Create Actor:

Actors are implemented by extending the Actor base trait and implementing the receive method.
Props is a configuration class used to specify options for the creation of actors.


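import akka.actor.{Actor, ActorRef, Props}

// Message types assumed here; the post does not show their definitions.
case class WhoToSend(who: String)
case object Message
case class Print(text: String)

object Sender {
  def props(message: String, receiverActor: ActorRef): Props = Props(new Sender(message, receiverActor))
}

class Sender(message: String, receiverActor: ActorRef) extends Actor {
  var receivedMessage = ""

  def receive: Receive = {
    case WhoToSend(who) =>
      receivedMessage = s"$message, $who" // remember the text to send
    case Message =>
      receiverActor ! Print(receivedMessage) // forward it to the receiver
  }
}

3. Actor communication (in Scala):
  • ! means “fire-and-forget”, i.e. send a message asynchronously and return immediately. Also known as tell.
  • ? sends a message asynchronously and returns a Future representing a possible reply. Also known as ask.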

import akka.actor.{ActorRef, ActorSystem}

object MainApp extends App {
  val system: ActorSystem = ActorSystem("MainApp")

  // Receiver is the actor that handles Print; its definition is not shown in this post.
  val receiverActor: ActorRef = system.actorOf(Receiver.props, "Receiver")

  val sender1: ActorRef = system.actorOf(Sender.props("Bob", receiverActor), "Sender1")

  sender1 ! WhoToSend("Tom")
  sender1 ! Message
}

Github Source Code