Data Collection and Preparation
Throughout this book we will be in the fortunate position of having datasets readily available for downloading and using to test the algorithms. This is, of course, less commonly the case when the desire is to learn about some new problem, when either the data has to be collected from scratch, or at the very least, assembled and prepared. In fact, if the problem is completely new, so that appropriate data can be chosen, then this process should be merged with the next step of feature selection, so that only the required data is collected. This can typically be done by assembling a reasonably small dataset with all of the features that you believe might be useful, and experimenting with it before choosing the best features and collecting and analysing the full dataset.
Often the difficulty is that a large amount of data might be relevant but is hard to collect, either because it requires many measurements to be taken, or because it is spread across a variety of places and formats. Merging it appropriately is difficult, as is ensuring that it is clean; that is, that it does not have significant errors, missing data, etc.
For supervised learning, target data is also needed, which can require the involvement of experts in the relevant field and significant investments of time.
Finally, the quantity of data needs to be considered. Machine learning algorithms need significant amounts of data, preferably without too much noise, but with increased dataset size comes increased computational costs, and the sweet spot at which there is enough data without excessive computational overhead is generally impossible to predict.
Feature Selection
Feature selection consists of identifying the features that are most useful for the problem under examination. This invariably requires prior knowledge of the problem and the data; our common sense was used in the coins example above to identify some potentially useful features and to exclude others.
As well as the identification of features that are useful for the learner, it is also necessary that the features can be collected without significant expense or time, and that they are robust to noise and other corruption of the data that may arise in the collection process.
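One simple way to experiment with a small pilot dataset before committing to features is to rank each candidate feature by how strongly it correlates with the target. The sketch below is only an illustration under stated assumptions: the feature names (`diameter_mm`, `mass_g`, `scratch_count`) and the pilot values are hypothetical, and absolute Pearson correlation is just one possible usefulness score, not a method taken from the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(var_x * var_y)

def rank_features(rows, targets, names):
    """Return (name, |correlation|) pairs, most strongly correlated first."""
    scores = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores[name] = abs(pearson(column, targets))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical pilot data: one row per coin, target is the coin class (0 or 1).
names = ["diameter_mm", "mass_g", "scratch_count"]
pilot_rows = [
    (19.0, 2.5, 3),
    (19.1, 2.6, 7),
    (24.0, 5.6, 5),
    (24.2, 5.7, 5),
]
pilot_targets = [0, 0, 1, 1]

ranked = rank_features(pilot_rows, pilot_targets, names)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A score like this only captures linear relationships with the target, so it is a first filter at best; cost of collection and robustness to noise still have to be judged separately.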
Algorithm Choice
Given the dataset, the choice of an appropriate algorithm (or algorithms) is what this book should be able to prepare you for, in that the knowledge of the underlying principles of each algorithm and examples of their use is precisely what is required for this.
Parameter and Model Selection
For many of the algorithms there are parameters that have to be set manually, or that require experimentation to identify appropriate values.
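As an illustration of the experimentation involved, the sketch below tries several values of one parameter and keeps the one that scores best on a separate validation set. Everything here is an assumption made for the example: the learner is a tiny one-dimensional k-nearest-neighbour classifier, the parameter is `k`, and the data points are made up (including one deliberately mislabelled training point at 1.6).

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority label among the k training points nearest to query.

    train is a list of (x, label) pairs with a single numeric feature x.
    """
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, held_out, k):
    """Fraction of held-out points whose label is predicted correctly."""
    correct = sum(knn_predict(train, x, k) == label for x, label in held_out)
    return correct / len(held_out)

def select_k(train, validation, candidates):
    """Score every candidate k on the validation set and return the best."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best_k = max(candidates, key=lambda k: scores[k])
    return best_k, scores

# Hypothetical data; the point at 1.6 is a noisy (mislabelled) example.
train = [(1.0, "a"), (1.2, "a"), (1.4, "a"), (1.6, "b"),
         (3.0, "b"), (3.2, "b"), (3.4, "b")]
validation = [(1.3, "a"), (1.7, "a"), (2.9, "b"), (3.1, "b")]

best_k, scores = select_k(train, validation, candidates=[1, 3, 7])
print(best_k, scores)
```

The validation set is kept apart from the training data so that the chosen value is not simply the one that memorises the training points best.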
Training
Given the dataset, algorithm, and parameters, training should simply be the use of computational resources to build a model of the data that can predict the outputs on new data.
Evaluation
Before a system can be deployed it needs to be tested and evaluated for accuracy on data that it was not trained on. This can often include a comparison with human experts in the field, and the selection of appropriate metrics for this comparison.
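A minimal sketch of this step, assuming a simple random hold-out split and three common metrics (accuracy, precision, recall). The function names, the 25% test fraction, and the "spam"/"ham" labels are all hypothetical choices for the example; which metrics are appropriate depends on the problem.

```python
import random

def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of data and hold out a fraction of it for testing."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def evaluate(true_labels, predicted_labels, positive):
    """Accuracy, precision and recall for one class treated as 'positive'."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hold out a quarter of 20 hypothetical examples for testing.
examples = list(range(20))
train_part, test_part = train_test_split(examples)

# Hypothetical true labels and model predictions for four test items.
metrics = evaluate(
    true_labels=["spam", "spam", "ham", "ham"],
    predicted_labels=["spam", "ham", "ham", "spam"],
    positive="spam",
)
print(len(train_part), len(test_part), metrics)
```

The same `evaluate` function could be applied to labels supplied by human experts on the held-out items, giving a like-for-like comparison with the trained system.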
Wednesday, February 24, 2016
Tuesday, February 23, 2016
Lessons from Predictably Irrational
- The Truth about Relativity: Why Everything Is Relative—Even When It Shouldn't Be
- Our minds are relative-oriented. We always look for something to compare against when making a decision.
- The Fallacy of Supply and Demand: Why the Price of Pearls—and Everything Else— Is Up in the Air
- The demand is controlled by the perception of value.
- The Cost of Zero Cost: Why We Often Pay Too Much When We Pay Nothing
- When we get something at zero cost, we feel an obligation to pay for it somehow.
- The Cost of Social Norms: Why We Are Happy to Do Things, but Not When We Are Paid to Do Them
- The best rule. We live in two worlds: social and market. If we try to pay for something social using the market system, we will probably have a problem.
- The Influence of Arousal: Why Hot Is Much Hotter Than We Realize
- Try to relax, and do not make decisions when you are in an aroused state. The sex market is based on this rule.
- The Problem of Procrastination and Self-Control: Why We Can't Make Ourselves Do What We Want to Do
- Restricting our freedom (equally spaced deadlines) is the best cure for procrastination.
- The High Price of Ownership: Why We Overvalue What We Have
- We still believe that, in general, owning something increases its value in the owner's eyes.
- Keeping Doors Open: Why Options Distract Us from Our Main Objective
- In the context of today's world, we work just as feverishly to keep all our options open. Given a simple setup and a clear goal (in this case, to make money), all of us are quite adept at pursuing the source of our satisfaction, or at collecting as many options as we can.
- The Effect of Expectations: Why the Mind Gets What It Expects
- When we believe beforehand that something will be good, it generally will be good, and when we think it will be bad, it will be bad.
- The Power of Price: Why a 50-Cent Aspirin Can Do What a Penny Aspirin Can't
- We still believe that expensive goods are better than cheap ones.
- The Context of Our Character, Part I: Why We Are Dishonest, and What We Can Do about It
- Our honesty monitor is quite flexible and adapts to our culture, and our decisions about honesty are based on a cost/benefit analysis.
- The Context of Our Character, Part II: Why Dealing with Cash Makes Us More Honest
- The days of cash are coming to a close. Cash is a drag on the profits of banks—they want to get rid of it. On the other hand, electronic instruments are very profitable.
- Beer and Free Lunches: What Is Behavioural Economics, and Where Are the Free Lunches?
- As long as these mechanisms provide more benefits than costs, we should consider them to be free lunches—mechanisms that provide net benefits to all parties.
Once we understand when and where we may make erroneous decisions, we can try to be more vigilant, force ourselves to think differently about these decisions, or use technology to overcome our inherent shortcomings.
Monday, February 22, 2016
Tuesday, February 09, 2016
The Difficulty of Engagement
If an organization were a soccer team made up of 11 players...
- Only 4 players would know which way they are attacking...
- Only 2 would care about winning the game...
- Only 2 players would not be competing against someone on their own team!
Monday, February 08, 2016
Friday, February 05, 2016
What is Value Stream Mapping?
“All we are doing is looking at the time line from the moment the customer gives us an order to the point when we collect the cash. And we are reducing that time line by removing the non-value-added wastes.” (Ohno, 1988)
- Overproduction: Producing items for which there are no orders.
- Waiting Time: Employees standing about. Inventory at a standstill.
- Unnecessary Transport: Moving material unnecessarily or over long distances.
- Over-processing: Using more steps to produce a product than necessary.
- Excess Inventory: Retaining unnecessary inventory between process steps.
- Unnecessary Movement: Any wasted motion by man or machine.
- Defects: Making incorrect products.
Value is from the customer’s perspective, the customer being the person who uses the output.