Thursday, December 22, 2016
17 Mental Triggers
Mental Trigger #1 – Scarcity
When we perceive that something may run out, we make a decision on impulse.
Mental Trigger #2 – Urgency
A form of scarcity.
Mental Trigger #3 – Authority
When we recognize someone's authority, we accept their information without questioning it.
Mental Trigger #4 – Reciprocity
When someone offers us a coffee, we feel obliged to give something back.
Mental Trigger #5 – Social Proof
We try to think like the group we belong to.
Mental Trigger #6 – Because
We need to understand the reasons for doing something.
Mental Trigger #7 – Anticipation
We like being curious about what is coming.
Mental Trigger #8 – Novelty
New things give us pleasure and create new synapses.
Mental Trigger #9 – Pain vs. Pleasure
We have a stronger tendency to flee from pain than to seek pleasure.
Mental Trigger #10 – Indifference
When we want something badly, we may disdain it to make it look unimportant.
Mental Trigger #11 – Commitment and Consistency
Our culture values congruence between what we say and what we do.
Mental Trigger #12 – Paradox of Choice
The more options there are, the harder it is for us to decide.
Mental Trigger #13 – Story
We like information to be conveyed chronologically and with a touch of drama.
Mental Trigger #14 – Simplicity
We like simpler things: the 5 steps, the easiest path, and so on.
Mental Trigger #15 – Reference
We only decide relative to something else (hence the three quotes).
Mental Trigger #16 – Curiosity
If part of the information about something we want is missing, we will try to find it out.
Mental Trigger #17 – Common Enemy
Bring up a problem you have been through that the other person has too; this will make the two of you feel connected.
Wednesday, November 30, 2016
Steps to developing a usable algorithm.
- Model the problem.
- Find an algorithm to solve it.
- Fast enough? Fits in memory?
- If not, figure out why.
- Find a way to address the problem.
- Iterate until satisfied.
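The "Fast enough? Fits in memory?" check can be sketched by timing a candidate and estimating the memory held by its result. The workload below is a made-up placeholder, not from the original note:

```python
# Sketch of the "Fast enough? Fits in memory?" step: run the candidate
# algorithm once, measure wall-clock time, and approximate the memory
# occupied by its output.
import sys
import time

def candidate(n):
    # Placeholder workload: build a sorted list of squares.
    return sorted(i * i for i in range(n, 0, -1))

start = time.perf_counter()
result = candidate(100_000)
elapsed = time.perf_counter() - start

# Rough working-set estimate: the list object plus its elements.
memory_bytes = sys.getsizeof(result) + sum(sys.getsizeof(x) for x in result)

print(f"elapsed: {elapsed:.3f}s, approx memory: {memory_bytes / 1e6:.1f} MB")
# If either number is unacceptable, figure out why, address it, and iterate.
```

This is only a first-order check; a profiler gives a finer breakdown when the answer to either question is "no".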
Thursday, November 24, 2016
Checklist for Recognizing and Measuring Data Quality
Accuracy. The value stored in the system for a data element is the right value for that occurrence of the data element. If you have a customer name and an address stored in a record, then the address is the correct address for the customer with that name. If you find the quantity ordered as 1000 units in the record for order number 12345678, then that quantity is the accurate quantity for that order.
Domain Integrity. The data value of an attribute falls in the range of allowable, defined values. The common example is the allowable values being “male” and “female” for the gender data element.
Data Type. Value for a data attribute is actually stored as the data type defined for that attribute. When the data type of the store name field is defined as “text,” all instances of that field contain the store name shown in textual format and not numeric codes.
Consistency. The form and content of a data field are the same across multiple source systems. If the product code for product ABC in one system is 1234, then the code for this product is 1234 in every source system.
Redundancy. The same data must not be stored in more than one place in a system. If, for reasons of efficiency, a data element is intentionally stored in more than one place in a system, then the redundancy must be clearly identified and verified.
Completeness. There are no missing values for a given attribute in the system. For example, in a customer file, there must be a valid value for the “state” field for every customer. In the file for order details, every detail record for an order must be completely filled.
Duplication. Duplication of records in a system is completely resolved. If the product file is known to have duplicate records, then all the duplicate records for each product are identified and a cross-reference created.
Conformance to Business Rules. The values of each data item adhere to prescribed business rules. In an auction system, the hammer or sale price cannot be less than the reserve price. In a bank loan system, the loan balance must always be positive or zero.
Structural Definiteness. Wherever a data item can naturally be structured into individual components, the item must contain this well-defined structure. For example, an individual's name naturally divides into first name, middle initial, and last name. Values for names of individuals must be stored as first name, middle initial, and last name. This characteristic of data quality simplifies enforcement of standards and reduces missing values.
Data Anomaly. A field must be used only for the purpose for which it is defined. If the field Address-3 is defined for any possible third line of address for long addresses, then this field must be used only for recording the third line of address. It must not be used for entering a phone or fax number for the customer.
Clarity. A data element may possess all the other characteristics of quality data, but if the users do not understand its meaning clearly, then the data element is of no value to the users. Proper naming conventions help to make the data elements well understood by the users.
Timely. The users determine the timeliness of the data. If the users expect customer dimension data not to be older than one day, the changes to customer data in the source systems must be applied to the data warehouse daily.
Usefulness. Every data element in the data warehouse must satisfy some requirement of the collection of users. A data element may be accurate and of high quality, but if it is of no value to the users, then it is totally unnecessary for that data element to be in the data warehouse.
Adherence to Data Integrity Rules. The data stored in the relational databases of the source systems must adhere to entity integrity and referential integrity rules. Any table that permits null as the primary key does not have entity integrity. Referential integrity forces the establishment of the parent–child relationships correctly. In a customer-to-order relationship, referential integrity ensures the existence of a customer for every order in the database.
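A few of the checks above (domain integrity, completeness, and duplication) can be sketched in plain Python. The customer records and the allowed value set below are made-up illustrations, not from the source text:

```python
# Hypothetical customer records used to demonstrate three data-quality checks.
customers = [
    {"id": 1, "name": "Ann", "gender": "female", "state": "NY"},
    {"id": 2, "name": "Bob", "gender": "male", "state": None},     # incomplete
    {"id": 3, "name": "Cy", "gender": "unknown", "state": "CA"},   # bad domain
    {"id": 1, "name": "Ann", "gender": "female", "state": "NY"},   # duplicate
]

ALLOWED_GENDERS = {"male", "female"}  # the defined domain for this attribute

def domain_violations(rows):
    """Domain integrity: records whose gender is outside the allowed values."""
    return [r["id"] for r in rows if r["gender"] not in ALLOWED_GENDERS]

def completeness_violations(rows, field):
    """Completeness: records with a missing value for the given field."""
    return [r["id"] for r in rows if r[field] is None]

def duplicate_ids(rows):
    """Duplication: IDs that appear in more than one record."""
    seen, dupes = set(), set()
    for r in rows:
        if r["id"] in seen:
            dupes.add(r["id"])
        seen.add(r["id"])
    return sorted(dupes)

print(domain_violations(customers))                 # -> [3]
print(completeness_violations(customers, "state"))  # -> [2]
print(duplicate_ids(customers))                     # -> [1]
```

In practice these checks would run as profiling queries against the source systems before the data is loaded into the warehouse.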
Friday, October 28, 2016
DRY on Microservices
Don't violate DRY within a microservice, but be relaxed about violating DRY across all services. The evils of too much coupling between services are far worse than the problems caused by code duplication.
Saturday, October 15, 2016
Stages in a pattern recognition problem
1. Formulation of the problem: gaining a clear understanding of the aims of the investigation and planning the remaining stages.
2. Data collection: making measurements on appropriate variables and recording details of the data collection procedure (ground truth).
3. Initial examination of the data: checking the data, calculating summary statistics and producing plots in order to get a feel for the structure.
4. Feature selection or feature extraction: selecting variables from the measured set that are appropriate for the task. These new variables may be obtained by a linear or nonlinear transformation of the original set (feature extraction). To some extent, the division of feature extraction and classification is artificial.
5. Unsupervised pattern classification or clustering. This may be viewed as exploratory data analysis and it may provide a successful conclusion to a study. On the other hand, it may be a means of preprocessing the data for a supervised classification procedure.
6. Apply discrimination or regression procedures as appropriate. The classifier is designed using a training set of exemplar patterns.
7. Assessment of results. This may involve applying the trained classifier to an independent test set of labelled patterns.
8. Interpretation.
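Stages 6 and 7 can be illustrated with a toy pure-Python classifier. The nearest-centroid rule and the 2-D data below are assumptions chosen for brevity, not anything prescribed by the list above:

```python
# Toy illustration of stages 6-7: design a classifier on a training set of
# exemplar patterns, then assess it on an independent labelled test set.

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def train(samples):
    """samples: list of (features, label). Returns one centroid per class."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    return min(model, key=lambda lab: (x[0] - model[lab][0]) ** 2
                                    + (x[1] - model[lab][1]) ** 2)

# Stage 6: design the classifier using a training set of exemplar patterns.
train_set = [((0, 0), "a"), ((1, 0), "a"), ((5, 5), "b"), ((6, 5), "b")]
model = train(train_set)

# Stage 7: assess the results on an independent labelled test set.
test_set = [((0, 1), "a"), ((5, 6), "b"), ((1, 1), "a")]
accuracy = sum(predict(model, x) == y for x, y in test_set) / len(test_set)
print(accuracy)  # -> 1.0
```

A real study would also cover stages 3-5 (data examination, feature extraction, clustering) before any classifier is fitted.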
Monday, October 03, 2016
Tree replication problem. The same subtree can appear in different branches.
This makes the decision tree more complex than necessary and perhaps more difficult to interpret. Such a situation can arise from decision tree implementations that rely on a single attribute test condition at each internal node. Since most of the decision tree algorithms use a divide-and-conquer partitioning strategy, the same test condition can be applied to different parts of the attribute space, thus leading to the subtree replication problem.
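A minimal sketch of the replication: for the target concept (A and B) or (C and D), a tree restricted to single-attribute tests must repeat the subtree that tests C and D under two different branches. The dictionary representation below is purely illustrative:

```python
# Decision tree as nested dicts: each internal node tests one attribute and
# branches on True/False; leaves are class labels 0 or 1.
from itertools import product

# Subtree that tests C, then D -- equivalent to (C and D).
cd_subtree = {"test": "C",
              True:  {"test": "D", True: 1, False: 0},
              False: 0}

# Full tree for (A and B) or (C and D). The C/D subtree appears twice,
# once under A=False and once under A=True, B=False: subtree replication.
tree = {"test": "A",
        True:  {"test": "B", True: 1, False: cd_subtree},
        False: cd_subtree}

def classify(node, record):
    """Walk the tree, following the branch for each tested attribute."""
    while isinstance(node, dict):
        node = node[record[node["test"]]]
    return node

# The tree agrees with (A and B) or (C and D) on every boolean input.
for a, b, c, d in product([False, True], repeat=4):
    record = {"A": a, "B": b, "C": c, "D": d}
    assert classify(tree, record) == int((a and b) or (c and d))
```

Here the shared subtree is reused by reference, but an induction algorithm that grows each branch independently would have to learn that subtree twice, from two smaller data partitions.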