Sunday, January 31, 2021

Levels of Testing


 

Monitoring Microservices

  • Monitoring is an umbrella term that covers various aspects of checking the well-being of services and applications, such as general health checks, latency, logging, and resource usage.
  • Profiling is about observing delays and understanding how much time each service is taking. It will help us to understand which services are taking time and pinpoint the problem areas.
  • Tracing is more about tracking the flow of control when a request is fired, more or less similar to profiling but with some different details. Distributed tracing extends the concept to a distributed system that has multiple Microservices deployed independently. For example, when the user hits a URL, which API is getting executed, which might call another service, and so on. We would like to know how the flow is moving and about the health of each service. Is there a service that is not responding or is slow? Distributed tracing will help us with this.
  • Logging can capture almost anything; we log the events of interest along with their parameter details. We can also log critical areas of the service or application, which can later help us understand what happened behind the scenes. Log monitoring can be done manually or automated through the use of log monitoring tools.
  • Metrics are another important way to look at the health of your system. They can be produced from logs or tracing data that show how much time various services are taking, to see whether something needs special attention. Different types of metrics can be generated on a per-need basis and provide information about the general health of the system at a glance.
  • Health checks are automated scripts or tools that keep tabs on the health of the services. This includes the health of the hardware infrastructure as well as the availability of different services (a minimal sketch of a health-check endpoint follows this list).
  • Alerting is the system that helps trigger an action when an unwanted or error condition is observed in the system. Email or messaging alerts can be sent based on need, and an escalation policy can be set up as per the system's requirements.
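As a concrete illustration of the health-check idea above, here is a minimal sketch of what a probe target could look like: a hypothetical /health endpoint served with Python's standard library. The service name, path, and port are assumptions for illustration, not part of the original notes.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # Report the service as UP; a richer check could also verify
            # database connections or downstream dependencies.
            body = json.dumps({"status": "UP", "service": "employee-service"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # A monitoring tool or load balancer can poll http://localhost:8080/health
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

An alerting system can then poll this endpoint periodically and escalate when it stops returning an UP status.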

Scaling Microservices with caching

  • Client-side cache: The client will maintain their own cache. For this example, let's say the client is maintaining a simple hashtable, with a key-value pair, where the key is the employee ID and the value is the employee data. Whenever the client needs data for an employee, it will first check its internal cache, and if the employee ID is found in the client-side cache, it would not need to make a call to the server, saving a round trip to the server. The results will be lightning fast in this case, as the client is serving the results internally. As with any cache, we will need to take care of certain aspects, such as an expiring cache and limiting the number of records that can be cached. The number of records that can be cached is critical in client-side caching as we are dependent on the user's machine. If the caller to the employee service is another service, we can have a greater number of records in the cache, depending on who is calling and the business requirements. Similarly, we need to make sure to expire cache records. Again, this is dependent on our business needs, such as how often we expect our employee records to be updated. 
  • Server-side cache: Caching is done at the service level rather than the caller level. The Microservice will maintain a cache of its own. There are many libraries that provide caching off the shelf, such as JCache or Memcached. You can also use third-party caching, such as a Redis cache, or build a simple caching mechanism within the code, as per your application's needs. The core idea is that we need not redo all the work when re-fetching the same data. For example, when we ask for an employee record against an employee ID, we might be fetching data from one or more databases and doing several calculations. The first time, we will do all the tasks and then store the record in the cache. The next time the data is requested for the same employee ID, we simply return the record from the cache. Again, we need to consider aspects such as expiration and cache size, as we discussed in the case of a client-side cache.
  • Proxy caching: Proxy caching is another technique that is gaining popularity. The idea is not to directly hit the main application server. The request first goes to a Proxy Server, and if the required data or artifact is available on the Proxy Server, we can avoid a call to the main server. The Proxy Server is usually close to the client, mostly in the same geographical area, so it is faster to access. Moreover, it helps us reduce the load on the main server. 
  • Distributed caching: As the name suggests, distributed caching is a mechanism for maintaining a cache in more than one place. There are multiple ways to implement a distributed cache. In its simplest form, we just keep a copy of the cache in multiple places, which helps us divide the load among multiple sources. This approach is useful when we are expecting a heavy load and the amount of data to be cached is not too great. Another form of distributed caching is used when we have lots of data to cache and cannot keep it in a single cache: we divide the data to be cached across multiple caching servers. The division can be based on application requirements. In some cases, the cache can be distributed based on geography; for example, users in India are served from a different cache than users in the US. Another simple piece of logic for cache distribution can be based on the data itself; for example, we cache employee details on different machines based on the first letters of their first names, such as A to F on machine 1, G to M on machine 2, and N to Z on machine 3. The method can be devised based on application requirements, but the core idea is to cache data on multiple distributed machines for easy access (see the sketch after this list).
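As referenced above, here is a minimal sketch of two of these ideas: a cache with an expiry time and a record limit (usable on either the client or the server side), and a letter-based routing function for distributing cached employee data across machines. The employee-service context, node names, and all numbers are illustrative assumptions.

```python
import time

# A minimal cache sketch (client- or server-side): TTL expiry plus a record limit.
class SimpleCache:
    def __init__(self, max_records=1000, ttl_seconds=300):
        self.max_records = max_records
        self.ttl_seconds = ttl_seconds
        self._store = {}  # employee_id -> (expires_at, record)

    def get(self, employee_id):
        entry = self._store.get(employee_id)
        if entry is None:
            return None                       # miss: caller fetches from the service
        expires_at, record = entry
        if time.time() > expires_at:
            del self._store[employee_id]      # expired: treat as a miss
            return None
        return record

    def put(self, employee_id, record):
        if len(self._store) >= self.max_records:
            self._store.pop(next(iter(self._store)))  # naive eviction: drop the oldest insertion
        self._store[employee_id] = (time.time() + self.ttl_seconds, record)

# Distributed caching by data: pick a cache node by the first letter of the
# first name (A-F, G-M, N-Z), as in the example above.
def cache_node_for(first_name):
    letter = first_name[0].upper()
    if "A" <= letter <= "F":
        return "cache-node-1"
    if "G" <= letter <= "M":
        return "cache-node-2"
    return "cache-node-3"
```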

So, what is NoSQL?

  • Key-value-based databases: This is perhaps the simplest kind of database. We store data in the form of key-value combinations. Think of a hashmap kind of structure, where we add a unique identifier as a key and data objects as values. It is very easy to scale, as long as we have unique keys. Examples of this kind of database include Redis and Riak (a small example follows this list).
  • Column-based databases: We have been using row-based relational databases for a long time. In row-based databases, we think of a record as a single object that can be stored in a database's table row. Column-based databases, in contrast, organize storage around columns rather than rows, which makes it efficient to read and aggregate a few columns across a very large number of records. Examples of databases using a column-based storage mechanism are Cassandra and Vertica.
  • Document-based databases: This can be thought of as an extension of key-value-based storage. In a key-value-based system, the value can be anything, but a document-based database adds the restriction that the stored data must follow a proper format; the data is stored as documents (for example, JSON or XML). Metadata is provided for each document being stored, for better indexing and searching. Examples of document-based storage databases are MongoDB and CouchDB.
  • Graph-based databases: This kind of database is useful when our data records are connected to other records in some way and we need a method to parse this connectivity. A simple example is when we are storing information for people, and we need to capture friendship information, such as P1 is a friend of P2, who in turn is a friend of P4 and P6, so we can capture some relationship between P1, P2, P4, P6, and so on. Examples of databases that use graph-based storage are Neo4J and OrientDB.
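As mentioned in the key-value item above, here is a tiny sketch of the key-value model using the redis-py client. The key naming scheme, TTL, sample record, and connection settings are assumptions for illustration only.

```python
import json
import redis  # assumes the redis-py client is installed

r = redis.Redis(host="localhost", port=6379, db=0)

# Store an employee record as a value under a unique key, with a one-hour TTL
r.set("employee:1001", json.dumps({"name": "Asha", "department": "HR"}), ex=3600)

# Fetch it back by key
raw = r.get("employee:1001")
employee = json.loads(raw) if raw else None
print(employee)
```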

Challenges with Microservices

  • The right level of modularization: You need to be very careful in determining how your application can be divided into Microservices. Too few would mean you're not getting the proper advantage of Microservices, and too many would mean a heavy DevOps requirement to make sure all the Microservices work well when deployed together. Too many Microservices can also have a performance impact due to inter-service communication needs. You need to carefully analyze the application and break it down into logical entities, based on what would make sense to be thought of as a separate module.
  • Different tech stacks to manage: One of the advantages of a Microservices-based architecture is that you are not dependent on one technical stack or language. For example, if one of the services is coded in Java, you can easily build another one in .NET or Python. But if you are not careful, this advantage can quickly become a problem. You might end up supporting dozens of technical stacks and managing expertise for each service independently. Moving team members between projects or teams also becomes difficult when required, as one team might be working on a completely different tech stack than another.
  • Heavy reliance on DevOps: If you are using too many Microservices, you need to monitor each one and make sure all the communication channels are healthy at all times.
  • Difficult fault management: If you have dozens of services communicating with each other, and one of them goes down or is slow to respond, it becomes difficult to identify the problem area. Also, you do not want a problem in a single service to impact other services, so you will need to make sure arrangements such as timeouts and fallbacks are in place to handle error situations (one such arrangement is sketched after this list).
  • Managing the data: As a rule of thumb, we try to make sure every Microservice manages its own data. But this is not always easy when data is shared among services, so we need to determine how much data each service should own and how the data should be shared among services.
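As one possible sketch of the "arrangements" mentioned in the fault-management item above: calling a dependent service with a timeout and a graceful fallback, so a slow or failing service does not drag down its callers. The service URL, the use of the requests library, and the fallback shape are all assumptions.

```python
import requests  # assumes the third-party requests library is available

def get_employee_projects(employee_id, timeout_seconds=2):
    # Call the (hypothetical) project service with a timeout so a slow
    # dependency cannot block this service indefinitely.
    try:
        response = requests.get(
            f"http://project-service/projects/{employee_id}",
            timeout=timeout_seconds,
        )
        response.raise_for_status()
        return response.json()
    except requests.RequestException:
        # Degrade gracefully instead of propagating the failure upstream
        return {"employee_id": employee_id, "projects": [], "degraded": True}
```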

Advantages of Microservices

 


  • Easy-to-manage Code: As we are able to modularize and divide our huge application code base into various Microservices, we are not dealing with the whole application code at any point in time.
  • Flexibility of choosing the tech stack: As every Microservice we create is potentially a separate application in itself with a different deployable, it is easy to choose a technology stack based on need. For example, if you have many Java-based services in an application, and a new requirement comes in which you feel can be handled more easily in Python than in Java, you can go ahead and build that service in Python, and deploy it without worrying about the rest of the application.
  • Scalability: As every Microservice can be deployed independently, it is easy to scale them without worrying about the impact on others. For example, let's say we know the reporting service is used heavily at every end of a quarter – we can scale only this Microservice by adding more instances, and the rest of the services remain unaffected.
  • Testing: Unit testing, to be more specific, is easy with a Microservices-based approach. If you are modifying the leave-management service with some new rules, you need not worry about other services, such as Employee Project Management. In the worst case, if your leave-management service breaks down due to faulty code, you can still edit and update the Employee project-related information without even knowing that some other service is broken.
  • Faster time to market: As we are dealing with only one part of the application at a time, it is easier to make changes and move to production. Testing and deployment efforts are minimal as we are dealing with a subset of the whole system at a time.
  • Easy to upgrade or modify: Whether we need to upgrade a Microservice's software or hardware, or completely rewrite the service, this is much easier in a Microservice-based architecture, as we are only upgrading one part of the application.
  • Higher fault tolerance: In a monolith architecture, one error can cause the whole system to crash. In a Microservice-based architecture, in the worst case, a single service will crash, but no other services will be impacted. We still need to make sure we are managing errors properly to take advantage of Microservice-based architecture.

What is Monolith design?

  • Single: In the case of an application, we are talking about a single piece of code or a single deployable: an application that can and should be deployed on a single machine.
  • Indivisible: We cannot break the application down; there is no easy way to divide the application code or deployable.
  • Slow to change: This is more of an implication of monolith design. It is a well-known fact that changing a small piece of code is easier than changing a big, monolithic codebase, especially when you don't know what implications the change will have.
  • Huge code base: As we are developing the application as a single unit, everything is placed under a single code base. 
  • Testing: As the application is managed as a single unit, we need to test the whole application, even if a small change is made, to make sure there are no integration or regression issues.
  • Availability: Let's say that, while updating an employee data report, a developer introduces an error that causes the system to run out of memory, bringing down the whole system. So a report that might not add much value to the system, and may be used only rarely, has the capability of bringing down the entire application.
  • Scalability: The code is deployed as a single unit, so we can only scale the application as a whole, making it a heavy operation. For example, if we just need to run multiple instances of salary processing on payday, we cannot do that in isolation; we need to scale the whole application, either by providing more hardware firepower (vertical scaling) or by making copies of the whole application on different machines (horizontal scaling).
  • Inflexible: Say we need to create a special reporting feature, and we know a tool or framework is available in a language different from the one we use. This is not possible in a monolith architecture; we cannot mix programming languages or technologies, so we are stuck with the original choice, which might have been made years ago and may no longer be relevant.
  • Difficult to upgrade: Even a small decision, such as moving from Java EE 6 to 8, would mean that the whole application, along with all the features, needs to be tested and even a small problem would hinder the upgrade, whereas if we were dealing with multiple small services, we could upgrade them independently and sequentially.
  • Slow Development and time to market: For these reasons, it is difficult to make changes to an application with a monolith design. This approach does not fit well in today's agile world, where customers need to see changes and modifications as soon as possible, rather than waiting for a release that takes years.

 

Sunday, January 24, 2021

The importance of bias in neural networks

Neural networks are layers of matrices containing weight and bias values, which are randomly initialized and then adjusted during training to model the problem at hand. Essentially, what a neural network tries to do is build a function that represents a complex relationship between the input variables and the output variables, one that would be very difficult, if not impossible, to model purely mathematically.

A neural network is able to model the relationship between input and output variables even when it is hard to describe with mathematical functions, such as the one represented by the red curve in the figure.

The role of the weights is to determine the impact that the value of each input variable has on the value of the output variables. Imagine, for example, that a model is trying to estimate the price of a car based on several of its characteristics. There is a relationship between the price and variables such as engine power, number of doors, mileage, and the time elapsed since its launch. These relationships could be described in the form of a linear equation:
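The equation itself appeared as a figure in the original post and is not reproduced here. A plausible reconstruction, under the assumption that the variables are ordered so that w3 multiplies the time since launch (as the next paragraph implies), would be:

\text{price} = w_1 \cdot \text{engine power} + w_2 \cdot \text{number of doors} + w_3 \cdot \text{time since launch} + w_4 \cdot \text{mileage} + b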

The time since launch, for example, will have a negative impact on the price; the more time has passed, the lower the value. The weight w3 expresses exactly this impact. But what is the value of the car at launch, when the time since launch is equal to zero? That is precisely the value of the bias.

For some variables, the bias has no practical meaning (for example, what is the value of the car when the number of doors is equal to zero?), but mathematically this concept is necessary to express the value of the output variable without the influence of the input variable. We can think, for instance, of what the value of a car without doors would be, even though such a car does not exist in the real world. What matters is that, given the simplification we are making with a linear relationship, each additional door will have a constant impact on the theoretical starting price of a car with no doors.

In neural networks, the impact of the bias shows up in several ways, which we describe below.

Translation of the neural network

Let us return to the relationship presented in the first figure. Imagine that we are trying to determine it with a neural network from the data shown as green dots in the figure below.

The arrangement of the green dots suggests that the graph crosses the y axis at the yellow point. Notice also that at this point x is equal to zero. In other words, the yellow point indicates the value of y when x is equal to zero. This is exactly the bias of the equation represented by the red curve. Without the inclusion of a bias in the training of the network that will model this problem, the network is forced to pass through the point (0, 0), resulting in the figure below.

This does not look like the ideal solution to the problem, and it will clearly hurt the neural network's performance when predicting values of y near x = 0.

In short, the bias allows the equation to shift across the graph instead of being pinned to the point (0, 0). In Cartesian mathematics, this shift is called a translation.
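A minimal numpy sketch of this translation effect, on synthetic data (the numbers are illustrative, not taken from the figures): fitting the same points with and without a bias term shows how the no-bias fit is forced through the origin.

```python
import numpy as np

# Synthetic data for a line that crosses the y axis away from the origin
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 5.0 + rng.normal(0, 1.0, size=x.shape)  # true slope 2, true bias 5

# Fit without a bias term: y ~ w * x, forced through (0, 0)
w_no_bias, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)

# Fit with a bias term: y ~ w * x + b (a column of ones models the bias)
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"without bias: y = {w_no_bias[0]:.2f} * x")
print(f"with bias:    y = {w:.2f} * x + {b:.2f}")
```

The no-bias fit distorts the slope to compensate for the missing intercept, which is exactly the degradation near x = 0 described above.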

Delaying or advancing activation

Activation functions are applied right after the output of a neural network layer, that is, after the layer's input value is multiplied by the weight and the bias of that layer is added. Using the sigmoid function as an example, we can observe that, without a bias element, the activation function is also pinned to the point (0, 0.5), as we can see below:

Each colored curve represents a sigmoid activation applied to an output with a different weight: f(x) = sig(w·x)

With the introduction of a bias, the activation graph can also translate, and the effect can be read as a "delaying" or "advancing" of the activation: larger or smaller values of x will produce the same activation effect.

Each colored curve represents the sigmoid activation in layers with the same weight but different biases: f(x) = sig(w·x + b)
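A small numpy sketch of this shift (all values illustrative): without a bias the sigmoid always passes through (0, 0.5), while a bias moves that crossing point to x = -b/w.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = 1.0

# Without a bias, the activation at x = 0 is always 0.5, whatever the weight
print("sig(w*0) =", sigmoid(w * 0.0))

# With a bias, the 0.5 activation point shifts to x = -b/w ("earlier" or "later")
for b in (-2.0, 0.0, 2.0):
    x_half = -b / w
    print(f"b={b:+.1f}: sig(w*x+b) reaches 0.5 at x={x_half:+.1f} ->", sigmoid(w * x_half + b))
```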

Generalization capacity

When a neural network without a bias is being trained, the weight is adjusted for every data instance it receives, as a function of that value (remember that the weight w is necessarily multiplied by the data x). Because of this, the change induced by one instance may cancel out the change caused by a previous instance, and the network may struggle to converge to the best solution. Since the value of the bias (b) is independent of the data value, it "levels" the training process, making the data-induced changes to the weights more targeted and less dramatic. Training thus becomes more stable, which has a positive impact on convergence to the ideal solution.

When to use bias?

As we have seen, introducing a bias gives much more flexibility to the kind of problem a neural network is able to model. However, we should remember that the bias values will also be adjusted during training, which has a computational cost. So, if we have good reasons to skip the bias, doing so is justifiable. That is the case for problems where we are certain that zeroed input variables result in zeroed output variables as well, such as the relationship between weight and height: zero height obviously corresponds to zero weight. But if we cannot establish this premise, it is safer to include the bias and let the training process itself drive it toward zero, if that is the case.

Sunday, January 17, 2021

Pattern Analysis Summary


 

Space-Based Architecture


 


Microservices Architecture Pattern


 



Microkernel Architecture


 

Event-Driven Architecture


 



Layered Architecture


 



Software Project Estimation: What Control?

There are three primary project controls we can manipulate:

Scope—controls the “what” of the solution

Effort—represents the power we apply toward building the solution

Duration—represents the time we have available for finishing the project

By adjusting each of these controls within the project envelope, we can affect the project’s progress and ultimately can drive the project toward desired objectives. There are other project aspects that can be recognized as distinct secondary controls. They have direct and indirect impact on the primary controls, and they are also very important in their own right for managing the less tangible outcomes of a project. These are the Environment, Software Quality, Metrics, and Value.

Environment, the well-being and collaborative capacity of people, can be treated as a secondary control. It almost directly converts to Effort—motivated people deliver more effectively. In this sense, Environment can be considered part of the Effort primary control, and by improving the environment, we increase the available effort that can be expended toward the project goal.

Software Quality, the well-being and capacity for change of the code base, can also be treated as a secondary control. However, for most teams this is only a theoretical control, since the quality the team can attain is constant (and maximum) within the envelope of a single project. Lowering the quality is of course possible, but it cannot be considered a control, since deliberately lowering quality makes no sense as a way to steer a project. In any case, improving software quality, if it can be done sustainably within the project, also translates to Effort, because once the team is working at an improved quality level, they expend less effort to achieve a comparable result.

Metrics reflect the health of the adopted development processes. Driving toward more controlled processes improves the predictability of events on the project and the likelihood that a forecast will be close to the actual outcome. To the extent that an improved process can be proven to facilitate better productivity, we can consider it an effective project control. However, let's not forget that individuals and interactions come before processes and tools. Enforcing rigid processes will backfire when creativity and thinking are the primary activities of people (which is the case on software development projects).

Value is about the worth of the end product. Prioritizing functionality, so that we first complete the more valuable pieces, is a critically important de-risking technique. Prioritization is a variation of the scope control. Occasionally people will discover valuable functionality that has not previously been recognized as an objective for the project. Pivoting for value represents product control, not project control. However, the value of the end result is so crucial to the project's success that the project must accommodate changes to scope where this has been deemed the correct course of action. If the change in scope cannot be contained within the current project envelope, then the whole project needs to be reframed.

Saturday, January 16, 2021

Software Development Laws

Brooks’ Law: “Adding manpower to a late software project makes it later.”

Gall’s Law: “A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.”

Parkinson’s Law: “Work expands so as to fill the time available for its completion.”

Conway’s Law: “Organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.”


Context switching

We can “generate” available thrust, and raise the angle of the projected line, by minimizing the amount of context switching. Context switching is caused by constant interruptions and by fragmentation of the work in progress when working on too many things simultaneously, even if all of them contribute to the software solution. By reducing context switching, and providing an environment where programmers can zone in on their work for longer periods, we can increase the available thrust compared to an environment where these factors are not considered. Measuring how much effective time we gain by minimizing context switching is not very difficult. The non-scientific way to do it is to ask developers how much time a day they feel is lost to interruptions, or to add 5–15 minutes for each interruption and multiply by the average number of interruptions per day. Since context switching can easily contribute to a 30–50% effective time loss (you read this correctly), it is imperative for a scrum master or project manager to familiarize themselves intimately with ways to minimize it:

  • Keeping an open and functioning communication network within the team (communication does not equal chatter) 
  • Ensuring people are available for feedback when needed
  • Guarding the team from undesired external communications (when someone, a manager maybe, comes with a random and unrelated request)
  • Scheduling meetings according to how developers work
  • Identifying queues, applying WIP limits (work in progress limits), and facilitating short cycle times
  • Having stories clearly specified
  • Promoting direct interaction, communication, and collaboration between all team members (improving sociometrics)

These are all examples of things we can do to minimize context switching for the team. The time loss is not the only negative that comes with context switching. Sometimes brilliant ideas will disappear or never appear because the creative context was not preserved for long enough. These are losses that are difficult to measure, but not so difficult to feel.
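A quick back-of-the-envelope check of the 30–50% figure above (the interruption counts here are assumptions, not measurements):

```python
# Assumed numbers: 12 interruptions a day, 15 minutes lost per interruption
interruptions_per_day = 12
minutes_lost_per_interruption = 15        # upper end of the 5-15 minute range
workday_minutes = 8 * 60

lost = interruptions_per_day * minutes_lost_per_interruption   # 180 minutes
print(f"Lost: {lost} min, about {lost / workday_minutes:.0%} of the day")  # about 38%
```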

Estimating the work


 

If too many pieces end up in the L and XL buckets (and if it looks like we won’t get enough chunks to get to the desired resolution), then we need to split those chunks further so we get them down to S or M. If most of the pieces get estimated as XS, then we need to zoom out and look at less detail.

Do not ask developers to provide high-precision estimates for intricate functionality based on a detailed specification. Ask them to provide high-resolution estimates, even if each individual estimate is of lower accuracy.
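A small sketch of the bucketing rule of thumb above. The effort range behind each T-shirt size is a hypothetical choice (the original sizing table is not reproduced here), and the 30% and 50% thresholds are likewise assumptions:

```python
from collections import Counter

# Hypothetical effort ranges (in ideal days) for each T-shirt bucket
BUCKETS = {"XS": (0, 1), "S": (1, 3), "M": (3, 5), "L": (5, 10), "XL": (10, float("inf"))}

def bucket_for(effort_days):
    for name, (low, high) in BUCKETS.items():
        if low <= effort_days < high:
            return name
    return "XL"

def review_sizing(estimates_in_days):
    counts = Counter(bucket_for(e) for e in estimates_in_days)
    total = len(estimates_in_days)
    if (counts["L"] + counts["XL"]) / total > 0.3:
        return counts, "Too many L/XL chunks: split them further toward S or M."
    if counts["XS"] / total > 0.5:
        return counts, "Mostly XS chunks: zoom out and estimate at less detail."
    return counts, "Resolution looks workable."

print(review_sizing([0.5, 2, 2, 4, 6, 12, 3, 1, 8]))
```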

PDCA model for project control and the accompanying project management practices


 

Forecasting Mechanics

1. Define a scale (ballpark approximation)—Scale can be as wide as the expected project length or it can be some portion of it. It is not to be confused with an overall estimate of the project. It might be roughly similar in length, but the actual project estimate is produced later.

2. Define a resolution—Resolution represents the number of chunks in which we need to break up work in order to gain a sufficient statistical grip. Higher resolution provides an increased certainty, but it also costs progressively more to sustain.

3. Initial estimation and sizing—Based on the desired resolution and other factors, we need to design ranges for estimating effort. Once we go through the initial estimation, we can size the project and take a first guess at whether the team has enough capacity.

4. Adjust for calendar time—It is crucial to convert the effort estimates into calendar time. This is because the actual completion data we collect in the next step is based on calendar time. Once we convert effort estimation into calendar times, we can also speculate about optimistic and pessimistic completion dates for the project.

5. Collect data—Estimation data is only one input for the forecast. Tracking the actual implementation times and observing other contextual factors provide another set of input information. The relationship between estimates and actual completion times becomes the basis for adjustments and predictive accuracy.

6. Identify trends and scenarios—Once we have collected enough data, we can start analyzing, making projections, and evaluating different scenarios.
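A simplified sketch of steps 3 through 6, under assumed numbers (the focus factor, chunk estimates, and observed actual-to-estimate ratios are all illustrative):

```python
from datetime import date, timedelta

effort_estimates_days = [2, 3, 1, 5, 3, 2, 4]   # remaining chunks, in effort days
focus_factor = 0.6                               # fraction of a calendar day spent on project work
observed_ratios = [1.1, 1.4, 0.9, 1.3]           # actual / estimated, from completed chunks

def projected_finish(start, optimistic_ratio, pessimistic_ratio):
    # Convert effort into calendar time, then stretch it by the observed ratios
    calendar_days = sum(effort_estimates_days) / focus_factor
    return (start + timedelta(days=calendar_days * optimistic_ratio),
            start + timedelta(days=calendar_days * pessimistic_ratio))

optimistic, pessimistic = projected_finish(date.today(), min(observed_ratios), max(observed_ratios))
print("Optimistic finish: ", optimistic)
print("Pessimistic finish:", pessimistic)
```

As more chunks complete, the observed ratios narrow and the optimistic and pessimistic dates converge, which is the prediction increase with the amount of data noted below.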

Corrective adjustment on shorter vs. longer project

 


Prediction increase with the amount of data


 

Why do developers detest estimation and forecasting?

To be able to work: that is what makers and artists, which software engineers are, want most. They enjoy working. They like being useful and spending time tinkering with whatever happens to be in their field of interest. What they typically don't enjoy is dealing with something they cannot perceive to be real, valuable, and true. They markedly dislike situations where they are the originator of things of questionable worth.

For makers, it is preferable to describe a complex problem in a complex way rather than sacrifice the truth and provide a simplistic and untrue answer. Makers can bend a little and deal with uncertainty, but only for short periods of time. They prefer to spend their time making things.

Why is this important? Because it is important to understand that software developers detest estimation, and forecasting by extension, since they cannot be proven to be true. They are only a guess. And for makers a guess represents little value. For this reason, we need to be sensitive and empathetic to their dislike when we need their input and cooperation. When developers see that we understand the binary unsustainability of our own request, they will oddly be more willing to help. But it is also important for another reason—when developers see that our forecasting efforts are ultimately designed to provide them with a more sensible environment for work, there is a material improvement in the relationship’s dynamics.

Software Project Estimation: Intelligent Forecasting, Project Control, and Client Relationship Management

  • “Certainty or safety is a basic need”—At some level every person needs safety. At the most elementary level, a person needs physical and psychological safety. This is true even when a person engages in an inherently risky endeavor like starting a new software project.
  • “A software team can deliver continuously within a controlled productivity range”— Modern delivery teams have mastered proven software engineering practices and have repeatedly demonstrated that their productivity can remain within constant limits throughout the duration of a project. We will take “constant limits” to mean that if there are two comparable pieces of functionality, then the team will complete these by expending comparable amounts of effort, and we can expect this to hold true throughout the project (with the possible exception of the first few weeks when people are picking up speed).
  • “Project control is more important than record keeping”—On a software project, the primary responsibility of anyone involved is to take actions with the intent to control and steer the project to success. Bookkeeping is of secondary or tertiary importance. The benefit of forecasting is to pull certain important decisions earlier in the project's life. We are not forecasting to prove something right or wrong, nor are we estimating to keep a record and hold people accountable for the estimation numbers they produced.
  • “It is only worth forecasting when there is ability to act”—A forecast on its own does not change the outcome of a project. There must be a real possibility that we make control decisions, which lead to measured and timely actions, and change the project’s parameters. If such readiness for action does not exist on a project, and will never exist regardless of new knowledge, then forecasting becomes useless.

Tuesday, January 05, 2021

Verb Tenses


(01) Present
(02) Present continuous
(03) Simple past
(04) Past continuous
(05) Present perfect
(06) Present perfect continuous
(07) Past perfect
(08) Past perfect continuous
(09) Future
(10) Future continuous
(11) Future perfect
(12) Future perfect continuous

(13) Conditional present
(14) Conditional present progressive
(15) Conditional perfect
(16) Conditional perfect progressive

(17) Present subjunctive
(18) Past subjunctive
(19) Past perfect subjunctive

(20) Imperative
(21) Present participle
(22) Past participle