DevOps in Amazon

Amazon is a real Ops-first, hard-core technology company. DevOps is a popular keyword in the IT world today, but only a few companies can put it into practice well and use it to drive core business growth.

  1. CI/CD tools are the first step. You must have a good pipeline from the development stage to the production stage. That pipeline should integrate security checks, integration tests, etc.
  2. Code review and bar raisers.
  3. One main Git branch.
  4. Canary deployment tools, to release a new feature progressively to a subset of users.
  5. Alarm and monitoring tools (infra, exceptions, errors…)
  6. Dashboard and metrics tools (gathering user data after launch)
  7. Developers own the full life cycle of their code (dev, test, production bug fixes, etc.)
  8. Highly skilled coworkers (managers, product managers, developers)
  9. Good communication channels (instant messaging tools, ticket tracking tools, knowledge sharing tools)

In my opinion, real DevOps cannot be achieved if any of the steps above is missing.
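To make the canary idea from step 4 concrete, here is a minimal sketch of how a release could be gated to a percentage of users. It is only an illustration: the class name `CanaryGate`, the method `isInCanary` and the hash-bucket approach are my own assumptions, not a description of any Amazon tool.

```java
/** Hypothetical canary gate: routes a fixed percentage of users to a new feature. */
final class CanaryGate {
    private final int rolloutPercent; // 0..100, raised step by step during the rollout

    CanaryGate(int rolloutPercent) {
        if (rolloutPercent < 0 || rolloutPercent > 100) {
            throw new IllegalArgumentException("rolloutPercent must be between 0 and 100");
        }
        this.rolloutPercent = rolloutPercent;
    }

    /** The same user always lands in the same bucket, so the decision is stable across requests. */
    boolean isInCanary(String userId) {
        int bucket = Math.floorMod(userId.hashCode(), 100); // bucket in 0..99
        return bucket < rolloutPercent;
    }

    public static void main(String[] args) {
        CanaryGate gate = new CanaryGate(10); // assumed first stage: about 10% of users
        for (String user : new String[] {"alice", "bob", "carol"}) {
            System.out.println(user + " sees new feature: " + gate.isInCanary(user));
        }
    }
}
```

Because the bucket depends only on the user id, raising the percentage only adds users to the canary group; nobody who already has the feature loses it.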

Software Basic – Definitions

Manifest: A manifest file in computing is a file containing metadata for a group of accompanying files that are part of a set or coherent unit. (wiki)

Metadata: Metadata is “data that provides information about other data”. In other words, it is “data about data.” (wiki)

Artifact: A software build contains not only the developer’s code but also a range of software artifacts. A DevOps artifact is a by-product produced during the software development process. It may consist of the project source code, dependencies, binaries or resources, and could be represented in different layouts depending on the technology. Software artifacts are usually stored in a repository. (JFrog)

Snapshot: A snapshot version in Maven is one that has not been released.
It’s basically “x.0 under development”.

AWS Basic – Network

Here are the AWS networking concepts that are fundamental for cloud computing.

Region: (e.g. us-east-1)

AWS has the concept of a Region, which is a physical location around the world where we cluster data centers.

Each AWS Region is designed to be isolated from the other AWS Regions. This design achieves the greatest possible fault tolerance and stability.

VPC:

The Amazon Virtual Private Cloud (Amazon VPC) service lets you provision a private, isolated section of the AWS Cloud where you can launch AWS services and other resources in a virtual network that you define.

Availability Zone: (e.g. us-east-1a)

An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. AZs give customers the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center.

Subnet:

Separate subnets for unique routing requirements. AWS recommends using public subnets for external-facing resources and private subnets for internal resources. For each Availability Zone, this Quick Start provisions one public subnet and one private subnet by default.

Internet Gateway:

An internet gateway is an access point through which your resources can access the internet and be accessed from the internet.

NAT Gateway:

A NAT gateway can route outgoing traffic from private subnets to the internet.

Route 53:

Amazon Route 53 is the DNS service available for your AWS resources.

FinTech in US – The first impression

After a month and a half of living in the US, I have some observations about the fintech system there.

  • Fewer “Neobanks”: unlike in the EU (UK and France), where there are many so-called “Neobanks”, there are fewer in the USA. Personally, I think the big banks in the USA don’t charge so many fees, and their applications and websites are already well designed, user friendly, fast and secure. When you open a new account you can get a bonus, and many banks offer a “cash back” system to their customers. So there is no need for “Neobanks”: the large traditional banks already do the job. Looking back at the EU FinTech system with this in mind, I think that in the long run the big banks will transform and work with the so-called “Neobanks”. The Neobank concept will disappear later, in my opinion.
  • Everything is about “credit”. In the US, the credit score is the key to personal finance. People spend a lot of money on credit, whereas in the EU people are used to saving money. Only big banks can provide credit cards; a Neobank can only provide a debit card, which I think is also one of the reasons why people don’t need Neobanks in the US.
  • “Plaid” is a big success. The biggest success in the FinTech world so far is the company “Plaid”. I have been watching it for several years. They build beautiful bank connection APIs and they focus only on the API, no other bullshit. Today, many popular apps in the US are built on Plaid, like Robinhood, Venmo, Coinbase etc. Plaid started in 2013, and 7 years later Visa announced it would acquire the company for $5.3 billion. What a story! Many French FinTechs should learn from it! In France, “Budget Insight” seems to do the same thing as Plaid, and the famous app Lydia works on top of its API. That is why Tencent recently led the $45 million Series B round in Lydia. I think there are some good FinTech startups in France that are doing the right thing, like Luko, Alan, Payfit, Qonto and Lydia. Personally, I see a good future for them.
  • The stock market is very active. In the US, the stock market is very active and people can really make money and become rich from it. Thanks to FinTech companies like Robinhood, everyone can now buy and sell stocks easily. In the EU, especially in France, that is not the case.

Generally, I think the US is a paradise for FinTech, while in the EU FinTech is very hard. I guess the best way forward is to find a win-win arrangement with the large traditional banks.

Green Lake Park – Seattle

A couple of weeks ago, on one of the rare days when the sun came out, I decided to go jogging in a park. I searched on Google Maps, and Green Lake Park seemed like a good idea because its location is very convenient. It is about 15 minutes from SLU by car and 30 minutes by bus.

When I got there, there were already many people cycling, walking or running around the lake.

Logging in Java

  • SLF4J: SLF4J stands for Simple Logging Facade for Java. It is nothing more than a facade for logging systems. It does not do the logging itself; it applies the facade design pattern to logging.

If you only include the SLF4J jar in your project, what messages will you get? Here is a very simple and easy-to-understand example from SLF4J. You will get these warning messages:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

This is simply because no logger implementation is on the classpath.

  • Log4J / Logback:

Log4j and Logback are Java logging frameworks that SLF4J can delegate to: Logback implements the SLF4J API natively, while Log4j is plugged in through a binding.

The SLF4J manual has a very good picture that helps you understand the layers between Log4j, Logback, other frameworks and SLF4J.

  • Use Cases:

When you design a Java library that will be included and used by other projects, you should only include SLF4J in the library and leave the choice of Java logging framework to the projects that use it.

If you are writing a service or an application, you should use SLF4J + Log4j, SLF4J + Logback, etc.

In short, libraries and other embedded components should consider SLF4J for their logging needs because libraries cannot afford to impose their choice of logging framework on the end-user. On the other hand, it does not necessarily make sense for stand-alone applications to use SLF4J. Stand-alone applications can invoke the logging framework of their choice directly. In the case of logback, the question is moot because logback exposes its logger API via SLF4J.

From slf4j FAQ
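To show why a facade with no implementation behind it silently logs nothing, here is a small sketch of the pattern in plain Java. It is not SLF4J's real code or API: `LogBackend`, `LogFacade` and `InMemoryBackend` are hypothetical names I made up to illustrate the idea of binding a backend at the application level.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical backend contract; stands in for a real framework such as Logback or Log4j. */
interface LogBackend {
    void write(String level, String message);
}

/** Hypothetical facade: library code depends only on this class, never on a concrete backend. */
final class LogFacade {
    // No-operation backend used while nothing is bound, similar in spirit to SLF4J's NOP fallback.
    private static final LogBackend NOP = (level, message) -> { };
    private static LogBackend backend = NOP;

    private LogFacade() { }

    /** Called by the application (not the library) to choose the implementation. */
    static void bind(LogBackend chosen) {
        backend = (chosen == null) ? NOP : chosen;
    }

    static void info(String message) { backend.write("INFO", message); }

    static void error(String message) { backend.write("ERROR", message); }
}

/** Example backend that keeps log lines in memory so the demo is easy to inspect. */
final class InMemoryBackend implements LogBackend {
    final List<String> lines = new ArrayList<>();

    @Override
    public void write(String level, String message) {
        lines.add(level + " " + message);
    }
}

final class FacadeDemo {
    public static void main(String[] args) {
        LogFacade.info("nobody is listening yet"); // dropped: no backend is bound
        InMemoryBackend memory = new InMemoryBackend();
        LogFacade.bind(memory);                    // the application picks the framework
        LogFacade.info("service started");
        System.out.println(memory.lines);
    }
}
```

The library side only ever calls `LogFacade`, so the application stays free to swap the backend without touching library code.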

Percentile – Monitoring

When we want to monitor a distributed system, we usually use percentiles. For example, P99 means the 99th percentile: we measure performance up to the 99% mark and exclude the worst 1%.

A concrete example: if a service’s P99 latency is 100ms, then 99% of its responses take 100ms or less.

Normally, calculating a percentile is expensive, because we have to collect the samples (say 100 of them), sort them, and pick the 99th one.

For monitoring, we usually take P50, P99 and P99.9.
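The "sort the samples and pick the 99th one" procedure can be sketched as below. This is a minimal example using the nearest-rank definition of a percentile, which is only one of several definitions in use; the class and method names are my own, and real monitoring systems usually rely on approximations instead of sorting every sample.

```java
import java.util.Arrays;

/** Minimal nearest-rank percentile calculation over a batch of latency samples. */
final class Percentiles {
    private Percentiles() { }

    /**
     * Returns the smallest sample such that at least `percentile` percent
     * of all samples are less than or equal to it.
     */
    static double nearestRank(double[] samples, double percentile) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("samples must not be empty");
        }
        if (percentile <= 0 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in (0, 100]");
        }
        double[] sorted = samples.clone(); // keep the caller's array untouched
        Arrays.sort(sorted);               // the expensive step: O(n log n)
        int rank = (int) Math.ceil(percentile * sorted.length / 100.0);
        return sorted[rank - 1];           // ranks are 1-based
    }

    public static void main(String[] args) {
        double[] latenciesMs = new double[100];
        for (int i = 0; i < latenciesMs.length; i++) {
            latenciesMs[i] = i + 1; // assumed sample data: 1ms, 2ms, ..., 100ms
        }
        System.out.println("P50 = " + nearestRank(latenciesMs, 50));
        System.out.println("P99 = " + nearestRank(latenciesMs, 99));
    }
}
```

With 100 samples, P99 is simply the 99th value after sorting, which matches the description above.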

Here is a good example by Elastic that helps to understand the concept, and another one for going deeper.

The links:

https://www.elastic.co/blog/averages-can-dangerous-use-percentile

https://blog.bramp.net/post/2018/01/16/measuring-percentile-latency/

Single point of failure (SPOF)

In the distributed systems world, single point of failure is a key term that you should always be aware of.

It means that if one part of the system fails, the whole system goes down. For example, if Service A sends messages to Service B via a single instance of a message queue, then when the queue fails, the communication between Service A and Service B is completely lost. This message queue is a SPOF of the system.

The key to removing a SPOF is “redundancy”; here is a very good document by Oracle that explains the point.

System “reliability” is also explained by Amazon.

The links:

https://docs.oracle.com/cd/E19424-01/820-4806/fjdch/index.html

https://wa.aws.amazon.com/wat.pillar.reliability.en.html
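As a rough illustration of redundancy for the message queue example above, the sketch below lets Service A fall back to a second queue instance when the first one is down. `MessageQueue` and `RedundantSender` are hypothetical names, and the simple "try the next one" policy is an assumption for the demo; a real setup would also need to handle things like duplicate delivery and health checks.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical queue client; a real one would talk to a broker over the network. */
interface MessageQueue {
    void send(String message) throws Exception;
}

/** Sends through the first healthy queue, so a single failed instance no longer stops delivery. */
final class RedundantSender {
    private final List<MessageQueue> queues;

    RedundantSender(List<MessageQueue> queues) {
        if (queues.isEmpty()) {
            throw new IllegalArgumentException("at least one queue is required");
        }
        this.queues = new ArrayList<>(queues);
    }

    /** Tries each queue in order and returns the index of the one that accepted the message. */
    int send(String message) {
        Exception lastFailure = null;
        for (int i = 0; i < queues.size(); i++) {
            try {
                queues.get(i).send(message);
                return i;
            } catch (Exception e) {
                lastFailure = e; // this instance is down, try the next replica
            }
        }
        throw new IllegalStateException("all queue instances failed", lastFailure);
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        MessageQueue broken = message -> { throw new java.io.IOException("queue is down"); };
        MessageQueue healthy = message -> delivered.add(message);

        List<MessageQueue> replicas = new ArrayList<>();
        replicas.add(broken);
        replicas.add(healthy);

        int usedIndex = new RedundantSender(replicas).send("hello from Service A");
        System.out.println("delivered via replica " + usedIndex + ": " + delivered);
    }
}
```

With only one queue in the list this degrades to the original SPOF: when that instance fails, `send` has nowhere else to go.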

First month at Amazon – “Culture Shock”

It has been nearly a month since I started working at Amazon in Seattle. To be honest, as a software developer, I have more or less realized my years-long “dream” of working at a top-notch, world-class tech company.

But once aboard this big warship, you quickly find yourself educated, or shocked, by Amazon’s strong company culture, and I would like to share some of it.

  1. “Day 1” culture: each day should be treated as your first day at Amazon, which means you should always be passionate, motivated and curious.
  2. “Customer Obsession”: “We Start With the Customer and We Work Backward”. “Focus on customers over competitors”…
  3. “Empty chair”: it is said that in the early days Jeff used to put an empty chair in each meeting. That empty chair represents our customer, and what he or she would say or expect.
  4. “Two pizza” team rule: in the early days of Amazon, Jeff Bezos set a rule that every internal team should be small enough to be fed with two pizzas.
  5. Word docs over PPT: Amazon prefers Word documents to PPT, and a document should not be longer than 6 pages.
  6. Amazon loves writing: we have internal wiki tools, and you can find anything on that wiki site. We put design documents, thoughts and everything useful into written wiki pages.
  7. “You own your career”: at Amazon everyone can be a leader; at the very least you are the leader of yourself. You are given enough space and freedom to be driven by your own ideas and actions. You don’t need to wait for orders from someone else.

Solving GitHub “Invalid username or password” Problem.

You may encounter an “Invalid username or password” problem when you have two-factor authentication (2FA) enabled and try “git push”.

The solution is to generate a personal access token and use it instead of your GitHub account password.

Here are the instructions from GitHub: https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line

Finally, use the token as the password when Git prompts you:

$ git clone https://github.com/username/repo.git
Username: your_username
Password: your_token