Wednesday, June 27, 2018
Forms of Evidence
A paper can be viewed as an assembly of evidence and supporting explanations; that is, as an attempt to persuade others to share your conclusions. Good science uses objective evidence to achieve aims such as persuading readers to make more informed decisions and deepening their understanding of problems and solutions. In a write-up you pose a question or hypothesis, then present evidence to support your case. The evidence needs to be convincing because the processes of science rely on readers being critical and skeptical; there is no reason for a reader to be interested in work that is inconclusive.
There are, broadly speaking, four kinds of evidence that can be used to support a hypothesis: proof, modelling, simulation, and experiment.
Proof. A proof is a formal argument that a hypothesis is correct (or incorrect). It is a mistake to suppose that the correctness of a proof is absolute—confidence in a proof may be high, but that does not guarantee that it is free from error; it is common for a researcher to feel certain that a theorem is correct but have doubts about the mechanics of the proof. Some hypotheses are not amenable to formal analysis, particularly hypotheses that involve the real world in some way. For example, human behaviour is intrinsic to questions about interface design, and system properties can be intractably complex. Consider an exploration to determine whether a new method is better than a previous one at lossless compression of images—is it likely that material as diverse as images can be modelled well enough to predict the performance of a compression algorithm? It is also a mistake to suppose that an asymptotic analysis is always sufficient. Nonetheless, the possibility of formal proof should never be overlooked.
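To make the point about asymptotic analysis concrete, here is a small illustrative sketch (my own example, not from the text above): insertion sort is O(n^2) and merge sort is O(n log n), yet on small inputs the asymptotically worse algorithm often wins because of its lower constant factors, which is exactly the kind of behaviour an asymptotic argument alone cannot settle.

```python
# Illustrative sketch: asymptotic complexity alone does not decide which
# algorithm is faster on small inputs. Input size and repetition count are
# arbitrary choices made for this example.
import random
import timeit

def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

data = [random.random() for _ in range(16)]  # a deliberately small input
print("insertion sort:", timeit.timeit(lambda: insertion_sort(data), number=10000))
print("merge sort:    ", timeit.timeit(lambda: merge_sort(data), number=10000))
```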
Model. A model is a mathematical description of the hypothesis (or some component of the hypothesis, such as an algorithm whose properties are being considered) and there will usually be a demonstration that the hypothesis and model do indeed correspond. In choosing to use a model, consider how realistic it will be, or conversely how many simplifying assumptions need to be made for analysis to be feasible. Take the example of modelling the cost of a Boolean query on a text collection, in which the task is to find the documents that contain each of a set of words. We need to estimate the frequency of each word (because words that are frequent in queries may be rare in documents); the likelihood of query terms occurring in the same document (in practice, query terms are thematically related, and do not model well as random co-occurrences); the fact that longer documents contain more words, but are more expensive to fetch; and, in a practical system, the probability that the same query had been issued recently and the answers are cached in memory. It is possible to define a model based on these factors, but, with so many estimates to make and parameters to tune, it is unlikely that the model would be realistic.
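As an illustration of how such a model might look in code, here is a minimal sketch under assumptions of my own (the function name, parameters, and the term-independence assumption are all illustrative, not from the text): the expected cost of a conjunctive query is taken to be the cost of scanning the postings lists plus the cost of fetching the expected number of matching documents, discounted by a cache-hit probability.

```python
# A toy cost model for a conjunctive Boolean query; every parameter below is
# an estimate that would have to be measured or guessed for a real system.
def query_cost(doc_freqs, n_docs, avg_fetch_cost, cache_hit_rate):
    """Estimate the expected cost of a conjunctive Boolean query.

    doc_freqs      -- document frequency of each query term (assumed independent,
                      which the text notes is unrealistic for related terms)
    n_docs         -- number of documents in the collection
    avg_fetch_cost -- average cost of fetching one candidate document
    cache_hit_rate -- probability the whole query is already cached
    """
    # Scanning each term's postings list costs time proportional to its length.
    postings_cost = sum(doc_freqs)

    # Under a naive independence assumption, the expected number of documents
    # containing all terms is n_docs * prod(f_i / n_docs).
    p_all = 1.0
    for f in doc_freqs:
        p_all *= f / n_docs
    expected_matches = n_docs * p_all

    uncached_cost = postings_cost + expected_matches * avg_fetch_cost
    # A cache hit sidesteps the whole computation.
    return (1.0 - cache_hit_rate) * uncached_cost

# Example: three terms with document frequencies 5000, 800, and 120 in a
# collection of one million documents.
print(query_cost([5000, 800, 120], 1_000_000, avg_fetch_cost=2.0, cache_hit_rate=0.3))
```

Even this toy version shows where the estimates pile up: term frequencies, co-occurrence, fetch costs, and cache behaviour each introduce a parameter that must be guessed or tuned.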
Simulation. A simulation is usually an implementation or partial implementation of a simplified form of the hypothesis, in which the difficulties of a full implementation are sidestepped by omission or approximation. At one extreme a simulation might be little more than an outline; for example, a parallel algorithm could be tested on a sequential machine by use of an interpreter that counts machine cycles and communication costs between simulated processors; at the other extreme a simulation could be an implementation of the hypothesis, but tested on artificial data. A simulation is a “white coats” test: artificial, isolated, and conducted in a tightly controlled environment.
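A minimal sketch of that “interpreter” style of simulation, using my own illustrative example of a tree-structured parallel sum: the algorithm runs sequentially, and counters stand in for machine cycles and inter-processor messages.

```python
# Simulate a parallel reduction on a sequential machine, counting simulated
# cycles and messages instead of measuring real hardware.
def simulate_parallel_sum(values, n_procs):
    """Simulate a tree-structured parallel sum on n_procs simulated processors."""
    # Distribute values round-robin and let each processor sum its share locally.
    local = [0] * n_procs
    compute_cycles = 0
    for i, v in enumerate(values):
        local[i % n_procs] += v
        compute_cycles += 1  # one simulated addition per element

    # Tree reduction: in each round, half the remaining partial sums are sent
    # to a partner processor, which adds them in.
    messages = 0
    stride = 1
    while stride < n_procs:
        for p in range(0, n_procs, 2 * stride):
            if p + stride < n_procs:
                local[p] += local[p + stride]
                messages += 1        # one simulated message
                compute_cycles += 1  # one simulated addition
        stride *= 2
    return local[0], compute_cycles, messages

total, cycles, msgs = simulate_parallel_sum(list(range(1000)), n_procs=8)
print(total, cycles, msgs)  # 499500, plus the simulated cost counters
```

The point is not the numbers themselves but that the cost counters can be inspected and varied without ever running on parallel hardware.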
A great advantage of a simulation is that it provides parameters that can be smoothly adjusted, allowing the researcher to observe behaviour across a wide spectrum of inputs or characteristics. For example, if you are comparing algorithms for removal of errors in genetic data, use of simulated data might allow you to control the error rate, and observe when the different algorithms begin to fail. Real data may have unknown numbers of errors, or only a couple of different error rates, so in some sense can be less informative. However, with a simulation there is always the risk that it is unrealistic or simplistic, with properties that mean that the observed results would not occur in practice. Thus simulations are powerful tools, but, ultimately, need to be verified against reality.
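As a sketch of the genetic-data example, again under assumptions of my own (the substitution-only error model, the trivial majority-vote “corrector”, and all parameter values are illustrative): simulated data makes the error rate an adjustable knob, so we can see roughly where correction starts to break down.

```python
# Generate a known "true" sequence, inject substitution errors at a chosen
# rate, and measure how much a simple majority-vote corrector recovers.
import random

BASES = "ACGT"

def make_reads(truth, n_reads, error_rate, rng):
    """Produce n_reads noisy copies of truth, with per-base substitution errors."""
    reads = []
    for _ in range(n_reads):
        read = [rng.choice([b for b in BASES if b != base])
                if rng.random() < error_rate else base
                for base in truth]
        reads.append(read)
    return reads

def majority_correct(reads):
    """Correct by per-position majority vote across the reads."""
    return "".join(max(BASES, key=lambda b: column.count(b))
                   for column in zip(*reads))

rng = random.Random(0)
truth = "".join(rng.choice(BASES) for _ in range(200))
for error_rate in (0.01, 0.05, 0.2, 0.4):
    reads = make_reads(truth, n_reads=5, error_rate=error_rate, rng=rng)
    accuracy = sum(c == t for c, t in zip(majority_correct(reads), truth)) / len(truth)
    print(f"error rate {error_rate:.2f}: recovered {accuracy:.1%} of bases")
```

With real reads the true sequence, and hence the accuracy, would be unknown, which is exactly the limitation the paragraph describes.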
Experiment. An experiment is a full test of the hypothesis, based on an implementation of the proposal and on real—or highly realistic—data. In an experiment there is a sense of really doing it, while in a simulation there is a sense of only pretending. For example, artificial data provides a mechanism for exploring behaviour, but corresponding behaviour needs to be observed on real data if the outcomes are to be persuasive. In some cases, though, the distinction between simulation and experiment can be blurry, and, in principle, an experiment only demonstrates that the hypothesis holds for the particular data that was used; modelling and simulation can generalize the conclusion (however imperfectly) to other contexts. Ideally an experiment should be conducted in the light of predictions made by a model, so that it confirms some expected behaviour. An experiment should be severe; seek out tests that seem likely to fail if the hypothesis is false, and explore extremes. The traditional sciences, and physics in particular, proceed in this way. Theoreticians develop models of phenomena that fit known observations; experimentalists seek confirmation through fresh experiments.
Tuesday, June 12, 2018
Some books
- A Arte de Enganar, por William Simon e Kevin Mitnick
- A Arte e a Ciência de Memorizar Tudo, por Joshua Foer
- A elite do atraso: da escravidão à Lava Jato, por Jessé Souza
- A espiral da morte: Como a humanidade alterou a máquina do clima, por Claudio Angelo
- A Genética do Esporte, por David Epstein
- A história do corpo humano, por Daniel E. Lieberman
- A informação – Uma história, uma teoria, uma enxurrada, por James Gleick
- A Mais Pura Verdade Sobre a Desonestidade, por Dan Ariely
- A Medida do Mundo: A Busca por um Sistema Universal de Pesos e Medidas, por Robert Crease
- A radiografia do golpe: entenda como e por que você foi enganado, por Jessé Souza
- A ralé brasileira: quem é e como vive, por Jessé Souza
- A Sabedoria das Multidões, por James Surowiecki
- A Segunda Era das Máquinas, por Erik Brynjolfsson e Andrew McAfee
- A tolice da inteligência brasileira: ou como o país se deixa manipular pela elite, por Jessé Souza
- A Vingança dos Analógicos, por David Sax
- Acredite, Estou Mentindo, por Ryan Holiday
- As 48 Leis do Poder, por Robert Greene
- Behave: The Biology of Humans at Our Best and Worst, por Robert Sapolsky
- Blockchain Revolution [em português], por Don Tapscott, Alex Tapscott
- Breve História de Quase Tudo, por Bill Bryson
- Caos: A Criação De Uma Nova Ciência, por James Gleick
- Casa-Grande & Senzala, por Gilberto Freyre
- Colapso, por Jared Diamond
- Como Aprendemos, por Benedict Carey
- Creating a Learning Society: A New Approach to Growth, Development, and Social Progress (Kenneth Arrow Lecture Series), por Joseph E. Stiglitz and Bruce C. Greenwald
- Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World, por Bruce Schneier
- De que é feito o universo?: A história por trás do prêmio Nobel de física, por Richard Panek
- Death by Black Hole: And Other Cosmic Quandaries, por Neil deGrasse Tyson
- Deep Survival, por Laurence Gonzales
- Digital Gold: Bitcoin and the Inside Story of the Misfits and Millionaires Trying to Reinvent Money, por Nathaniel Popper
- E O Cérebro Criou o Homem, por António R. Damásio
- Evolução e corpo humano
- Five Billion Years of Solitude, por Lee Billings
- Greek Fire, Poison Arrows, and Scorpion Bombs: Biological and Chemical Warfare in the Ancient World, por Adrienne Mayor
- Guerra Mundial Z: Uma história oral da guerra dos zumbis, por Max Brooks
- Impérios Da Comunicação, por Tim Wu
- Inheritance: How Our Genes Change Our Lives, and Our Lives Change Our Genes (English Edition), por Sharon Moalem
- Inventing Iron Man: The Possibility of a Human Machine, por Paul Zehr
- Lá vem todo mundo: o poder de organizar sem organizações, por Clay Shirky
- Life Unfolding: How the Human Body Creates Itself, por Jamie Davies
- Linked: A Nova Ciência dos Networks, por Albert-Laszlo Barabasi
- The Making of the Atomic Bomb, por Richard Rhodes
- Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming, por Naomi Oreskes e Erik M. Conway
- Moral Tribes: Emotion, Reason, and the Gap Between Us and Them, por Joshua Greene
- Now – The Physics of Time, por Richard Muller
- Numbers Rule: The Vexing Mathematics of Democracy, from Plato to the Present, por George G. Szpiro
- O Andar do bêbado: como o acaso determina nossas vidas, por Leonard Mlodinow
- O Cérebro Imperfeito, por Dean Buonomano
- O Filtro Invisível: o que a internet está escondendo de você, por Eli Pariser
- O Gorila Invisível, por Christopher Chabris e Daniel Simons
- O Humano Mais Humano, por Brian Christian
- O Instinto Da Linguagem: Como A Mente Cria A Linguagem, por Steven Pinker
- O Livro Dos Códigos, por Simon Singh
- O Otimista Racional, por Matt Ridley
- O Poder das Conexões, por Nicholas A Christakis
- O poder do hábito: Por que fazemos o que fazemos na vida e nos negócios, por Charles Duhigg
- O Que Nos Faz Felizes. O Futuro Nem Sempre E O Que Imaginamos, por Daniel Gilbert
- O que nos faz humanos, por Matt Ridley
- Os Centros Urbanos A Maior Invenção Da Humanidade, por Edward L. Glaeser
- Os Homens Que Encaravam Cabras, por Jon Ronson
- Os Inovadores, por Walter Isaacson
- Overcomplicated: Technology at the Limits of Comprehension, por Samuel Arbesman
- Para explicar o mundo: A descoberta da ciência moderna, por Steven Weinberg
- Pense no garfo! Uma história da cozinha e de como comemos, por Bee Wilson
- Perdido em Marte: Uma missão a Marte. Um terrível acidente. A luta de um homem pela sobrevivência, por Andy Weir
- Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, por Eric Siegel
- Previsivelmente Irracional, por Dan Ariely
- Próxima Parada: Marte, por Mary Roach
- Rápido e devagar: Duas formas de pensar, por Daniel Kahneman
- Risco – A Ciência e a Política do Medo, por Dan Gardner
- Sapiens. Uma Breve História da Humanidade, por Yuval Noah Harari
- Smarter Than You Think: How Technology Is Changing Our Minds for the Better, por Clive Thompson
- Spam Nation: The Inside Story of Organized Cybercrime-from Global Epidemic to Your Front Door, por Brian Krebs
- Superintelligence: Paths, Dangers, Strategies, por Nick Bostrom
- Talvez Você Também Goste, por Tom Vanderbilt
- The Age of Cryptocurrency: How Bitcoin and Digital Money Are Challenging the Global Economic Order, por Paul Vigna, Michael J. Casey
- The Art of Strategy: A Game Theorist’s Guide to Success in Business and Life, por Avinash Dixit e Barry J. Nalebuff
- The Dark Net, por Jamie Bartlett
- The Half-Life of Facts: Why Everything We Know Has an Expiration Date, por Samuel Arbesman
- The Improbability Principle: Why Coincidences, Miracles, and Rare Events Happen Every Day, por David J. Hand
- The Knowledge Illusion: Why We Never Think Alone, por Steven Sloman e Philip Fernbach
- The Righteous Mind: Why Good People Are Divided by Politics and Religion, por Jonathan Haidt
- The Science of Interstellar, por Kip Thorne
- The Secrets of Alchemy, por Lawrence M. Principe
- The Storytelling Animal: How Stories Make Us Human, por Jonathan Gottschall
- Um Antropólogo em Marte, por Oliver Sacks
- Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, por Cathy O’Neil
- Why Zebras Don’t Get Ulcers: The Acclaimed Guide to Stress, Stress-Related Diseases, and Coping, por Robert M. Sapolsky