Group Details Private

Global Moderators

Forum wide moderators

  • A personal review of automated testing tools in the JavaScript world – part 1

    My “2 cents” on some (open source) tools that I have used, or am still using, to ensure quality in software development projects. This is the first in a series of posts I will write (and update frequently) about my experiences with test automation using testing tools that support JavaScript. In this first post … Continue reading A personal review of automated testing tools in the JavaScript world – part 1

    posted in Feed de Blogs e Posts
  • How do you help developers to test ?

    Hello guys, I have joined some webinars from Test Masters Online. I'm not sure if you have heard of them, but this one really caught my attention. The title looks a bit extreme, but you will see that it is more about…

    Continue reading →

    posted in Feed de Blogs e Posts
  • RE: TypeError: protractor_1.element(...).get is not a function

    A stack trace and the code would help the community analyze this…

    posted in Geral
  • What If Courage was Contagious?

    TLDR; Lead without leadership authority by assuming that courage is contagious
    I suspect most emotions are contagious. Laughter is contagious. Fear is contagious. Courage is contagious.

    All of these I’ve used as tactics in my work communication.

    I’m pretty sure I covered both Laughter and Fear in Dear Evil Tester, so in this blog post we’ll get serious and look at courage.

    Courage is Contagious

    This tactic can help you lead when you are not in a leadership position.

    This tactic can help your team grow and embody the behaviours you want to encourage in your environment.

    This tactic can help improve communication on projects.

    Have the courage to:

    • do what needs to be done
    • step up and do the work you want to see done
    • do the work the way you want to see it done
    • learn what you want your team to learn
    • pair with your team members to learn
    • say the things in meetings that need to be said
    • say “No”
    • say “I don’t know”
    • ask questions that reveal unstated assumptions
    • ask questions that reveal conflicting understanding
    • ask open questions that allow someone to make their point clearly without you being seen to put words in their mouth
    • discuss the way forward

    Unfortunately, of the emotion-based tactics I mentioned, this was the hardest of the three for me to implement, because at the time I needed to implement it I didn’t have people in visible leadership positions doing the same.

    I had to draw on memories of multiple people, from incidents from multiple projects. Fortunately I had seen people I worked with do this in other companies.

    Hopefully you’ll have seen people interacting in this way and can take courage from their previous examples. If not, there’s always comics, movies and novels to draw inspiration from.

    If courage is contagious, then if you start, people will follow.

    posted in Feed de Blogs e Posts
  • AMA: Separate Repository for e2e Tests?

    Liam asks… “I did enjoy reading the article about e2e test on wordpress. I noted that e2e test are in a separate repo. My question will be what is the workflow to make sure new changes does not break the e2e test on pull request ? For example, if a developer work on some changes, … Continue reading “AMA: Separate Repository for e2e Tests?”

    posted in Feed de Blogs e Posts
  • RE: Automation that involves retrieving a PIN from e-mail

    Well, since you really didn't want to reconsider the environment, here is one more “helper” for this:

    posted in Geral
  • Cypress with Galen Study Project

    Hello peeps, today I am going to post a study project that I created to try out Cypress and Galen. As some of you know, Cypress doesn’t support cross-browser testing at the moment. This is because the creators believe that nowadays you don’t really need to do functional tests on all the browsers, so […]

    posted in Feed de Blogs e Posts
  • Efficacy Presubmit

    By Peter Spragins
    with input from John Roane, Collin Johnston, Matt Rodrigues and Dave Chen

    A Brief History of Efficacy

    A small team, originally named “Test Efficacy”, was formed in 2014 to quantify the value of individual tests to the development process. Some tests were particularly valuable because they provided a reliable breakage signal for critical code. Some tests were not useful because they were non-deterministic or they never failed. Confoundingly, the value of a test would also change over time. The team’s initial intention was to present this information to developers and help them optimize the development process.

    To achieve the goal of informing developers about their tests, the team had to collect a huge amount of developer infrastructure/workflow data from a variety of sources across Google. Collecting all of this data in one place turned out to be incredibly valuable.

    In addition to collecting and processing the data, the team developed a somewhat radical philosophy towards running tests at scale: the only important results come from tests which deterministically fail. Running an additional test that you know will pass is not a valuable signal to developers, and likely a waste of resources.

    Background on Google Presubmit

    The process of committing code at Google has several testing stages. Perhaps the three most important testing stages are:

    1. Individual ad-hoc testing
    2. Presubmit
    3. Continuous build/continuous integration (hereafter referred to as continuous build).

    Stages 1 and 2 can actually be interleaved in any order and repeated any number of times.
    A presubmit executes all of the tests which are known to be affected by the edited code within one user’s proposed code changes. The “affected tests” are calculated with the help of a “project definition”, a configuration maintained by teams. A presubmit can run at any point during the change proposal process, but most importantly it must run before a user can permanently commit their changes.
    Continuous build, (3), is the continuous running of all tests within a project at the newest committed version of the code. Continuous build will execute tests even when they have already passed at presubmit.
    The same test may run several times at presubmit during the development process, one last time at presubmit before a commit, and then once again at continuous build after the change is merged into the main branch of Google’s huge repository. For this reason, a “missed failure” at presubmit is not a critical failure. The test will still be run at continuous build, and the change will be rolled back if the test fails.

    Efficacy Presubmit Service

    Efficacy Presubmit Service is the fusion of “running the right tests at the right time” with one of the largest collections of test/developer data in the world. The service has one simple job: save time and resources by not running, or even compiling, tests that we are very confident will pass at Presubmit. The ideal “Efficacy Presubmit” would predict which tests will pass ahead of time and only run tests which were going to fail. Then the user can get feedback from the failing tests, and fix their mistakes with the minimal possible cost of user and CPU time.
    To make this idea possible we have made one significant abstraction of the actual presubmit testing process. In a given presubmit there may be zero tests run, or many. In a presubmit with one test, if that test fails then the presubmit fails. In a presubmit with a thousand tests, only one failing test will still fail the presubmit. Efficacy Presubmit makes the abstraction that each of these test executions is an equivalent unit. This greatly simplifies creating a training dataset.
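    The abstraction above can be sketched as follows. This is only an illustration, not the service's actual data model: the `Execution` record, the `flatten` helper and the sample presubmits are all hypothetical names and data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Execution:
    """One (presubmit, test) pair: the unit a model would be trained on."""
    presubmit_id: str
    test_name: str
    failed: bool  # label: True means the test failed


def presubmit_failed(results: Dict[str, bool]) -> bool:
    """A presubmit fails if any one of its tests fails; zero tests means it passes."""
    return any(results.values())


def flatten(presubmits: Dict[str, Dict[str, bool]]) -> List[Execution]:
    """Turn presubmit-level results into equivalent per-test training rows."""
    return [
        Execution(presubmit_id, test_name, failed)
        for presubmit_id, results in presubmits.items()
        for test_name, failed in results.items()
    ]


# Hypothetical data: test name -> True if the test failed.
presubmits = {
    "p1": {"a_test": False},
    "p2": {"a_test": False, "b_test": True, "c_test": False},
    "p3": {},  # a presubmit may run zero tests
}
rows = flatten(presubmits)
```

    Every row carries the same weight regardless of how large its presubmit was, which is what makes building a training dataset simple.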

    Machine Learning / Probabilistic Safety

    Quick background on ML

    ML techniques and processes are quite well known throughout the industry at this point. The TensorFlow tutorials are a great introduction. The type of ML we use is classification. A classifier is essentially a mapping from the domain of the dataset to the range of the classes. MNIST is a very famous example of classification. An MNIST classifier maps from the domain of the input image to the range of digits {0, 1, …, 9}.

    In some other classification problems, the inputs are more “tabular”. A famous example of tabular classification is Iris Species. This is very similar to what Efficacy does.

    Efficacy’s Application of ML

    Given the abstraction on the presubmit testing process described above, predicting the outcomes of automated testing at a large company is a perfect machine learning problem in many ways. You have:

    1. A very large labelled dataset: the set of test executions and their results

    2. Copious numerical feature columns with trustworthy values, such as:

        • Recent failure history of each test

        • Various “distance” metrics from edited source files to tests, i.e. is this a test for the edited code?

        • Test size and runtime data

    3. Several dimensions that can be aggregated
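    The post does not say how the “distance” metrics are defined. As one possible illustration, assuming a build graph is available as a mapping from each target to the targets it depends on, a distance could be the fewest dependency edges between a test and an edited file. The `dependency_distance` function and the `deps` graph below are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def dependency_distance(
    deps: Dict[str, List[str]], test: str, edited: Set[str]
) -> Optional[int]:
    """Fewest dependency edges from `test` to any edited target, or None if unreachable."""
    seen = {test}
    queue = deque([(test, 0)])
    while queue:
        node, dist = queue.popleft()
        if node in edited:
            return dist
        for dep in deps.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, dist + 1))
    return None


# Hypothetical build graph: target -> targets it depends on.
deps = {
    "foo_test": ["foo"],
    "foo": ["util"],
    "bar_test": ["bar"],
    "bar": [],
}
```

    Under this assumption, a small distance suggests the test was written for the edited code, while `None` means the edit cannot affect the test through the graph at all.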

    There are some aspects of the problem which make ML difficult as well:

    1. The classes are highly imbalanced with respect to labels (the vast majority of tests are going to pass, not fail)
    2. Flaky tests can mislead the model because their labels are “untrue”

    We chose to reduce the problem to binary classification. The model chooses whether or not to run the test. In other words, failure is the positive class, and everything else is the negative class.

    We pick a threshold that results in an extremely low number of false negatives - failing tests which are not run because the model thinks they would have passed. This does reduce the number of skipped tests, true negatives, in exchange for a very high margin of safety. In addition to this, tests will be run afterwards at continuous build anyway, making presubmit skipping very safe.
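    A minimal sketch of this kind of threshold selection, assuming the model outputs a failure score per test and that a test is run when its score is at or above the threshold. The function names and the validation data are hypothetical; the post does not describe how the real threshold is chosen.

```python
import math
from typing import List


def pick_threshold(scores: List[float], failed: List[bool], min_sensitivity: float) -> float:
    """Highest threshold that still runs at least `min_sensitivity` of the failing tests."""
    failing = sorted((s for s, f in zip(scores, failed) if f), reverse=True)
    if not failing:
        raise ValueError("need at least one failing example")
    # Number of failing tests that must end up at or above the threshold.
    needed = max(1, math.ceil(min_sensitivity * len(failing)))
    return failing[needed - 1]


def should_run(score: float, threshold: float) -> bool:
    """Run the test unless the model is confident enough that it will pass."""
    return score >= threshold


# Hypothetical validation data: predicted failure score and actual outcome per test.
scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.05]
failed = [True, False, True, False, False, False]

threshold = pick_threshold(scores, failed, min_sensitivity=1.0)
skipped = [s for s in scores if not should_run(s, threshold)]
```

    Here, demanding that every failing test be run pushes the threshold down to the lowest failing score, so one passing test with a high score is still run while the three lowest-scoring tests are skipped.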

    Difficulties of Scale

    In addition to the problems that were natural to the “schema” of the dataset, we faced some problems due to the scale of Google’s testing.

    Many of these problems stem from the fact that Google works out of one large repository (paper, talk). Because of this some presubmits have a very large number of tests and some commits require a large number of presubmits before they are finished. This means that the service has to make predictions for a very large number of tests all at once. If a presubmit tried to run every test at Google, then the service would have to predict each test individually. That means N times the number of columns, etc. Loading the data to generate all of these feature values uses a lot of memory.

    Another difficulty of doing this work at scale is that even with very rare false negatives, they will still happen somewhat frequently. This requires our team to be open to communication with any customer team. In some cases we may have to tell them they were the victim of a very low probability event. In other cases we may find a bug, or room for improvement.


    The two key numbers for the system’s performance are sensitivity, the percentage of failing tests we actually execute, and specificity, the percentage of passing tests we actually skip. The two numbers go hand in hand. For a given model, requiring a higher sensitivity will result in a lower specificity, and vice versa. We can easily tune the percentage of tests skipped, resulting in changes to the fidelity of the testing signal the developers receive. When the system is wrong, it can have some negative impact on developers if the prediction is a false negative. Rarely, it will allow a developer to commit code that will break a test during continuous build. This results in a broken “project”, which takes some time to detect, and then a roll-back of the code. This requires some developer time, and a flexible mentality towards testing. In order to achieve a positive balance from this, we must extract millions of skipped tests for every negative developer experience. The sensitivity of our system is very high, and our specificity is around 25%.
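    Using the definitions in this paragraph, both numbers can be computed from a list of run/skip decisions and the actual outcomes. The sketch below uses made-up data and a hypothetical helper name, and assumes there is at least one failing and one passing test.

```python
from typing import List, Tuple


def sensitivity_specificity(ran: List[bool], failed: List[bool]) -> Tuple[float, float]:
    """Return (share of failing tests that were run, share of passing tests that were skipped)."""
    failing_total = sum(1 for f in failed if f)
    passing_total = len(failed) - failing_total
    failing_run = sum(1 for r, f in zip(ran, failed) if f and r)
    passing_skipped = sum(1 for r, f in zip(ran, failed) if not f and not r)
    return failing_run / failing_total, passing_skipped / passing_total


# Hypothetical outcomes for five tests: one fails, four pass.
failed = [True, False, False, False, False]
# Hypothetical decisions: the failing test is run, one passing test is skipped.
ran = [True, True, True, True, False]

sensitivity, specificity = sensitivity_specificity(ran, failed)
```

    In this toy case the single failing test is run (sensitivity 1.0) and one of four passing tests is skipped (specificity 0.25).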

    posted in Feed de Blogs e Posts
  • RE: Automation that involves retrieving a PIN from e-mail

    Look, the truth is that the environment you are testing in needs an adjustment: something you can pass in that it will accept so the flow can continue. Depending on the model, it is worth creating a whitelist (in test environments only), so that when you pass a value that is on that list the flow is allowed through. Validating this kind of thing costs more than the application itself.

    posted in Geral
  • RE: Questions about testing a web service

    I've seen a lot of people using RestSharp; it seems pretty easy to implement :)

    posted in Geral