Genetic programming (GP) is an evolutionary algorithm, an artificial intelligence technique mimicking natural evolution, which operates on a population of programs. It applies the genetic operators selection according to a predefined fitness measure, mutation and crossover. The crossover operation involves swapping specified parts of selected pairs (parents) to produce new and different offspring that become part of the new generation of programs. Some programs not selected for reproduction are copied from the current generation to the new generation. Mutation involves substitution of some random part of a program with some other random part of a program. Then the selection and other operations are recursively applied to the new generation of programs. Typically, members of each new generation are on average more fit than the members of the previous generation, and the best-of-generation program is often better than the best-of-generation programs from previous generations. Termination of the evolution usually occurs when some individual program reaches a predefined proficiency or fitness level. It may and often does happen that a particular run of the algorithm results in premature convergence to some local maximum that is not a globally optimal or even good solution. Multiple runs (dozens to hundreds) are usually necessary to produce a very good result. It may also be necessary to have a large starting population size and variability of the individuals to avoid pathologies. == History == The first record of the proposal to evolve programs is probably that of Alan Turing in 1950 in "Computing Machinery and Intelligence". There was a gap of 25 years before the publication of John Holland's 'Adaptation in Natural and Artificial Systems' laid out the theoretical and empirical foundations of the science. 
In 1981, Richard Forsyth demonstrated the successful evolution of small programs, represented as trees, to perform classification of crime scene evidence for the UK Home Office. Although the idea of evolving programs, initially in the computer language Lisp, was current amongst John Holland's students, it was not until they organised the first Genetic Algorithms (GA) conference in Pittsburgh that Nichael Cramer published evolved programs in two specially designed languages, which included the first statement of modern "tree-based" genetic programming (that is, procedural languages organized in tree-based structures and operated on by suitably defined GA-operators). In 1988, John Koza (also a PhD student of John Holland) patented his invention of a GA for program evolution. This was followed by publication in the International Joint Conference on Artificial Intelligence IJCAI-89. Koza followed this with 205 publications on "genetic programming", a term coined by David Goldberg, also a PhD student of John Holland. However, it is the series of 4 books by Koza, starting in 1992 with accompanying videos, that really established GP. Subsequently, there was an enormous expansion of the number of publications with the Genetic Programming Bibliography, surpassing 10,000 entries. In 2010, Koza listed 77 results where genetic programming was human competitive. The departure of GP from the rigid, fixed-length representations typical of early GA models was not entirely without precedent. Early work on variable-length representations laid the groundwork. One notable example is messy genetic algorithms, which introduced irregular, variable-length chromosomes to address building block disruption and positional bias in standard GAs. Another precursor was robot trajectory programming, where genome representations encoded program instructions for robotic movements—structures inherently variable in length. 
Even earlier, unfixed-length representations were proposed in a doctoral dissertation by Cavicchio, who explored adaptive search using simulated evolution. His work provided foundational ideas for flexible program structures. In 1996, Koza started the annual Genetic Programming conference, which was followed in 1998 by the annual EuroGP conference, and the first book in a GP series edited by Koza. 1998 also saw the first GP textbook. GP continued to flourish, leading to the first specialist GP journal and three years later (2003) the annual Genetic Programming Theory and Practice (GPTP) workshop was established by Rick Riolo. Genetic programming papers continue to be published at a diversity of conferences and associated journals. Today there are nineteen GP books including several for students. === Foundational work in GP === Early work that set the stage for current genetic programming research topics and applications is diverse, and includes software synthesis and repair, predictive modeling, data mining, financial modeling, soft sensors, design, and image processing. Applications in some areas, such as design, often make use of intermediate representations, such as Fred Gruau's cellular encoding. Industrial uptake has been significant in several areas including finance, the chemical industry, bioinformatics and the steel industry. == Methods == === Program representation === GP evolves computer programs, traditionally represented in memory as tree structures. Trees can be easily evaluated in a recursive manner. Every internal node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. Thus traditionally GP favors the use of programming languages that naturally embody tree structures (for example, Lisp; other functional programming languages are also suitable). 
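The tree representation and its recursive evaluation, as described above, can be sketched in a few lines. This is only an illustrative sketch: the nested-tuple encoding, the `FUNCTIONS` table and the `evaluate` helper are assumptions made for this example and are not taken from any particular GP system.

```python
import operator

# Hypothetical function set for this sketch: name -> (callable, arity).
FUNCTIONS = {
    "add": (operator.add, 2),
    "sub": (operator.sub, 2),
    "mul": (operator.mul, 2),
}


def evaluate(node, env):
    """Recursively evaluate an expression tree.

    An internal node is a tuple (function_name, child_1, ..., child_k).
    A terminal node is either a number or a variable name looked up in env.
    """
    if isinstance(node, tuple):
        func, arity = FUNCTIONS[node[0]]
        children = node[1:]
        if len(children) != arity:
            raise ValueError(f"{node[0]} expects {arity} children")
        # Evaluate the operands first, then apply the operator.
        return func(*(evaluate(child, env) for child in children))
    if isinstance(node, str):
        return env[node]  # variable terminal
    return node  # constant terminal


# The expression (x * x) + (y - 3) as a tree.
tree = ("add", ("mul", "x", "x"), ("sub", "y", 3))
result = evaluate(tree, {"x": 2, "y": 5})
```

Because every subtree is itself a valid expression, operators that swap or replace subtrees always yield another tree that `evaluate` can process, which is one reason this encoding is convenient for GP.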
Non-tree representations have been suggested and successfully implemented, such as linear genetic programming, which perhaps suits the more traditional imperative languages. The commercial GP software Discipulus uses automatic induction of binary machine code ("AIM") to achieve better performance. μGP uses directed multigraphs to generate programs that fully exploit the syntax of a given assembly language. Multi expression programming uses three-address code for encoding solutions. Other program representations on which significant research and development have been conducted include programs for stack-based virtual machines, and sequences of integers that are mapped to arbitrary programming languages via grammars. Cartesian genetic programming is another form of GP, which uses a graph representation instead of the usual tree based representation to encode computer programs. Most representations have structurally noneffective code (introns). Such non-coding genes may seem to be useless because they have no effect on the performance of any one individual. However, they alter the probabilities of generating different offspring under the variation operators, and thus alter the individual's variational properties. Experiments seem to show faster convergence when using program representations that allow such non-coding genes, compared to program representations that do not have any non-coding genes. Instantiations may have both trees with introns and those without; the latter are called canonical trees. Special canonical crossover operators are introduced that maintain the canonical structure of parents in their children. === Initialisation === The methods for creation of the initial population include: Grow creates the individuals sequentially. Every GP tree is created starting from the root, creating functional nodes with children as well as terminal nodes up to a certain depth. Full is similar to the Grow. 
The difference is that all branches in a tree are of the same predetermined depth. Ramped half-and-half creates a population consisting of $md-1$ parts, with a maximum tree depth of $md$. The first part has a maximum depth of 2, the second of 3, and so on up to the $(md-1)$-th part with maximum depth $md$. Half of every part is created by Grow, while the other half is created by Full.

=== Selection ===
Selection is the process whereby certain individuals are selected from the current generation to serve as parents for the next generation. The individuals are selected probabilistically such that better-performing individuals have a higher chance of being selected. The most commonly used selection method in GP is tournament selection, although other methods such as fitness proportionate selection, lexicase selection, and others have been demonstrated to perform better for many GP problems. Elitism, which involves seeding the next generation with the best individual (or best n individuals) from the current generation, is a technique sometimes employed to avoid regression.

=== Crossover ===
In genetic programming two fit individuals are chosen from the population to be parents for one or two children. In tree genetic programming, these parents are represented as inverted Lisp-like trees, with their root nodes at the top. In subtree cro
Artificial intelligence in hiring
Artificial intelligence can be used to automate aspects of the job recruitment process. Advances in artificial intelligence, such as the advent of machine learning and the growth of big data, enable AI to be utilized to recruit, screen, and predict the success of applicants. Proponents of artificial intelligence in hiring claim it reduces bias, assists with finding qualified candidates, and frees up human resource workers' time for other tasks, while opponents worry that AI perpetuates inequalities in the workplace and will eliminate jobs. Despite the potential benefits, the ethical implications of AI in hiring remain a subject of debate, with concerns about algorithmic transparency, accountability, and the need for ongoing oversight to ensure fair and unbiased decision-making throughout the recruitment process. == Background == It is common for companies to use AI to automate aspects of their hiring process, especially the hospitality, finance, and tech industries. == Uses == === Screeners === Screeners are tests that allow companies to sift through a large applicant pool and extract applicants that have desirable features. What factors are used to screen applicants is a concern to ethicists and civil rights activists. A screener that favors people who have similar characteristics to those already employed at a company may perpetuate inequalities. For example, if a company that is predominantly white and male uses its employees' data to train its screener it may accidentally create a screening process that favors white, male applicants. The automation of screeners also has the potential to reduce biases. Biases against applicants with African American sounding names have been shown in multiple studies. An AI screener has the potential to limit human bias and error in the hiring process, allowing more minority applicants to be successful. === Recruitment === Recruitment involves the identification of potential applicants and the marketing of positions. 
AI is commonly utilized in the recruitment process because it can help boost the number of qualified applicants for positions. Companies are able to use AI to target their marketing to applicants who are likely to be good fits for a position. This often involves the use of social media sites' advertising tools, which rely on AI. Facebook allows advertisers to target ads based on demographics, location, interests, behavior, and connections. Facebook also allows companies to target a "look-a-like" audience: the company supplies Facebook with a data set, typically the company's current employees, and Facebook targets the ad to profiles that are similar to the profiles in the data set. Additionally, job sites like Indeed, Glassdoor, and ZipRecruiter target job listings to applicants that have certain characteristics employers are looking for. Targeted advertising has many advantages for companies trying to recruit, such as being a more efficient use of resources, reaching a desired audience, and boosting the number of qualified applicants. This has helped make it a mainstay in modern hiring. Who receives a targeted ad can be controversial. In hiring, the implications of targeted ads have to do with who is able to find out about and then apply to a position. Most targeted ad algorithms are proprietary information. Some platforms, like Facebook and Google, allow users to see why they were shown a specific ad, but users who do not receive the ad likely never know of its existence and also have no way of knowing why they were not shown the ad.

=== Interviews ===
Chatbots were one of the first applications of AI and are commonly used in the hiring process. Interviewees interact with chatbots to answer interview questions, and an analysis of their responses can be generated by AI. HireVue has created technology that analyzes interviewees' responses and gestures during recorded video interviews.
Over 12 million interviewees have been screened by the more than 700 companies that utilize the service.

== Controversies ==
Artificial intelligence in hiring confers many benefits, but it also poses challenges that have concerned experts. AI is only as good as the data it is using. Biases can inadvertently be baked into the data used in AI. Often companies will use data from their employees to decide what people to recruit or hire. This can perpetuate bias and lead to more homogeneous workforces. Facebook Ads was an example of a platform that created such controversy by allowing business owners to specify what type of employee they were looking for. For example, job advertisements for nursing and teaching could be set such that only women of a specific age group would see the advertisements. Facebook Ads has since removed this function from its platform, citing its potential to perpetuate biases and stereotypes against minorities. The growing use of artificial intelligence-enabled hiring systems has become an important component of modern talent hiring, particularly through social networks such as LinkedIn and Facebook. However, data overflow embedded in the hiring systems, based on natural language processing (NLP) methods, may result in unconscious gender bias. Utilizing data-driven methods may mitigate some of the bias generated by these systems. It can also be hard to quantify what makes a good employee. This poses a challenge for training AI to predict which employees will be best. Commonly used metrics like performance reviews can be subjective and have been shown to favor white employees over black employees and men over women. Another challenge is the limited amount of available data. Employers only collect certain details about candidates during the initial stages of the hiring process. This requires AI to make determinations about candidates with very limited information.
Additionally, many employers do not hire employees frequently and so have limited firm-specific data to draw on. To combat this, many firms will use algorithms and data from other firms in their industry. AI's reliance on applicants' and current employees' personal data raises privacy issues. These issues affect both applicants and current employees, and may also have implications for third parties who are linked through social media to applicants or current employees. For example, a sweep of someone's social media will also show their friends and people they have tagged in photos or posts.

== AI and the future of hiring ==
Artificial intelligence, along with other technological advances such as improvements in robotics, has placed 47% of jobs at risk of being eliminated in the near future. In 2016 the founder of the World Economic Forum, Klaus Schwab, called AI and related technology the "Fourth Industrial Revolution". According to some scholars, however, the transformative impact of AI on labor has been overstated. The "no-real-change" theory holds that an IT revolution has already occurred, but that the benefits of implementing new technologies do not outweigh the costs associated with adopting them. This theory claims that the result of the IT revolution is thus much less impactful than had originally been forecast. Other scholars reject this theory, claiming that AI has already led to significant job loss for unskilled labor and that it will eliminate middle-skill and high-skill jobs in the future. This position is based on the idea that AI is not yet a technology of general use and that any potential fourth industrial revolution has not fully occurred. A third theory holds that the effect of AI and other technological advances is too complicated to be understood yet. This theory is centered on the idea that while AI will likely eliminate jobs in the short term, it will also likely increase the demand for other jobs.
The question then becomes will the new jobs be accessible to people and will they emerge near when jobs are eliminated. == AI use in hiring for candidates == Job seekers now commonly encounter AI-driven tools at multiple stages, including automated resume parsing, video interview analysis, chatbots for frequently asked questions, and real‑time application updates. Some candidates also employ AI career agents, designed to optimize job searches, tailor applications, and interface with hiring teams. A 2025 Australian study found that AI-driven video interviews exhibited transcription error rates of up to 22% for non‑native speakers and those with speech-related disabilities, raising concerns of discrimination. A 2017 study in the Journal of Sociology found persistent gender and racial disparities in AI screening tools, even when fairness interventions are applied. Industry observers describe a growing “AI arms race” in recruitment, where both employers and candidates increasingly rely on automated agents. Employers use recruiting systems to source and filter applicants, while candidates deploy AI agents to prepare and submit applications. == Regulations == The Artifici
Liang Wenfeng
Liang Wenfeng (Chinese: 梁文锋; pinyin: Liáng Wénfēng; born 1985) is a Chinese entrepreneur and businessman who is the co-founder of the quantitative hedge fund High-Flyer, as well as the founder and CEO of its artificial intelligence company DeepSeek. Liang attended Zhejiang University, and began his career by applying machine learning methods to quantitative finance. Through High-Flyer, he built large-scale computing infrastructure that was later used to support artificial intelligence research, leading to the creation of DeepSeek in 2023. DeepSeek gained international attention following the release of DeepSeek-R1, which analysts described as demonstrating high-level performance with comparatively limited compute resources. In 2025, Liang was named to Time magazine's list of 100 Most Influential People in AI and Fortune's list of the Most Powerful People in Business.

== Early life ==
Liang was born in 1985 in the village of Mililing (米历岭村), Qinba town (覃巴镇), Wuchuan city (吴川市), Guangdong. His parents were both primary school teachers. Liang was routinely praised by locals and teachers alike. From middle school onward, he was remembered both for reading comic books and for being very proficient in mathematics.

== Education ==
After elementary school, Liang attended Wuchuan No. 1 Middle School. There, he quickly excelled in class and ranked highly amongst his peers. He taught himself high school and university-level mathematics courses. Liang then attended Wuchuan No. 1 High School. In these years, he developed hobbies of mathematical modeling and conducting research projects. He consistently ranked highly among his peers, placing within the top three in every mathematics exam. He was also the top scorer in the Zhanjiang region of Guangdong for the college entrance exam. Thus, in 2002, Liang left high school early to pursue his education at the university level at the age of 17.
Attending Zhejiang University at the age of 17, Liang earned a Bachelor of Engineering in Electronic Information Engineering in 2007 and his Master of Engineering in Information & Communication Engineering in 2010. His master's dissertation was titled "Study on Object Tracking Algorithm Based on Low-Cost PTZ camera" (基于低成本PTZ摄像机的目标跟踪算法研究). In his college years, DJI founder Wang Tao asked Liang to join as a co-founder. Liang declined the invitation to pursue artificial intelligence methodologies in financial markets. While he states that those around him had entrepreneurial mindsets, he himself valued academics. == Career == === Early career (2008–2016) === During the 2008 financial crisis, Liang formed a team with his classmates to accumulate data related to financial markets. He also led the team to explore quantitative trading using machine learning and other technologies. After his graduation, Liang moved to a cheap flat in Chengdu, Sichuan, where he experimented with ways to apply AI to various fields. These ventures failed, until he tried applying AI to finance. In 2013, Liang attempted to integrate artificial intelligence with quantitative trading and founded Hangzhou Yakebi Investment Management Co Ltd with Xu Jin, an alumnus of Zhejiang University. In 2015, they co-founded Hangzhou Huanfang Technology Co Ltd, which is today's Zhejiang Jiuzhang Asset Management Co Ltd. === High-Flyer (2016–2023) === In February 2016, Liang and two other engineering classmates co-founded Ningbo High-Flyer Quantitative Investment Management Partnership (Limited Partnership). The team relied on mathematics and AI to make investments. Much of the early startup culture was described by former employees to be "geeky" and "quirky," often seen as contrary to the existing culture in large Chinese tech companies. In 2019, Liang founded High-Flyer AI which was dedicated to research on AI algorithms and its basic applications. 
By this time, High-Flyer had over 10 billion yuan in assets under management. On 30 August 2019, Liang Wenfeng delivered a keynote speech entitled "The Future of Quantitative Investment in China from a Programmer's Perspective" at the Private Equity Golden Bull Award ceremony held by China Securities Journal, which sparked heated discussions. Liang stated that the criterion for determining what is quantitative or non-quantitative is whether the investment decision is made by quantitative methods or by people. In quantitative funds, decisions are made not by portfolio managers but by servers. He also stated that High-Flyer's mission is to improve the effectiveness of China's secondary market. In February 2021, Gregory Zuckerman's book The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution was published. Liang wrote the preface for the Chinese edition of the book, in which he stated that whenever he encountered difficulties at work, he would think of Simons' words "There must be a way to model prices". In January 2025, Zuckerman wrote in The Wall Street Journal that he was aware of the preface and had been trying to get in touch with Liang but that, much like Simons, Liang is very secretive and difficult to contact. During 2021, Liang started buying thousands of Nvidia GPUs for his AI side project while running High-Flyer. Liang wanted to build something that would be a game changer, which his business partners thought was only possible for giants such as ByteDance and Alibaba Group.

=== DeepSeek (since 2023) ===
==== DeepSeek begins ====
In May 2023, Liang announced High-Flyer would pursue the development of artificial general intelligence and launched DeepSeek. During that month, in an interview with 36Kr, Liang stated that High-Flyer had acquired 10,000 Nvidia A100 GPUs before the US government imposed AI chip restrictions on China. That laid the foundation for DeepSeek to operate as an LLM developer.
Liang also stated that DeepSeek gets its funding from High-Flyer. This was because, when DeepSeek was founded, venture capital firms were reluctant to provide funding, as it was unlikely to generate an exit in a short period of time. Liang personally holds only 1% of the company, with 99% of the company being held by Ningbo High-Flyer Quantitative Investment Management Partnership (Limited Partnership). Under this funding model, DeepSeek lacks commercial pressure and rigid key performance indicators, enabling the company to deviate from previously established model architectures.

==== Early development ====
In July 2024, Liang was interviewed again by 36Kr. He stated that when DeepSeek-V2 was released and triggered an AI price war in China, it came as a huge surprise, as the team did not expect pricing to be so sensitive. Liang's aggressive pricing of the language model forced domestic tech giants including Alibaba and Baidu to cut their own rates by over 95%. He also stated that as China's economy develops, it should gradually become a contributor instead of freeriding. In his view, what is lacking in China's innovation is not capital but confidence and knowledge of how to organize talent for it. He said that DeepSeek has not hired anyone particularly special and that its employees tend to be locally educated, and that when it comes to disruptive technologies, closed-source approaches can only temporarily delay others in catching up. As the goal was long-term, DeepSeek sought employees who had ability and passion rather than experience. To retain a high talent density relative to larger firms like ByteDance or Baidu, DeepSeek aimed to maintain a low-hierarchy corporate culture, with members working in project-based groups, as well as competitive compensation. Liang emphasized his vision for DeepSeek employees to bring their "unique experience and ideas" instead of needing to be explicitly directed, with an overall bottom-up approach to division of labor.
Liang noted that a significant outcome of this approach was the multi-head latent attention training architecture, which was attributed directly to a young DeepSeek researcher's personal interest. This advancement played a core role in reducing the cost of training the DeepSeek-V3 model, released in December 2024.

==== Release of DeepSeek-R1 ====
On 20 January 2025, DeepSeek, the company Liang founded and leads as CEO, released DeepSeek-R1, a 671-billion-parameter open-source reasoning AI model, alongside the publication of a detailed technical paper explaining its architecture and training methodology. The model was built using just 2,048 Nvidia H800 GPUs at a cost of $5.6 million, showcasing a resource-efficient approach that contrasted sharply with the billion-dollar budgets of Western competitors. The development of DeepSeek-R1 occurred amidst U.S. sanctions under which Trump limited sales of Nvidia chips to China. By 27 January, DeepSeek surpassed ChatGPT to become the #1 free app on the United States iOS App Store. U.S. stocks plummeted, as more than $1 trillion was erased in market capitalization amid panic over DeepSeek. Technology journ
Minion (solver)
Minion is a solver for constraint satisfaction problems (CSPs). Unlike constraint programming toolkits, which expect users to write programs in a traditional programming language like C++, Java or Prolog, Minion takes a text file which specifies the problem, and solves it using only this. This makes using Minion much simpler, at the cost of much less customization. On large, hard instances, Minion has been reported to be faster than contemporary constraint toolkits, including the commercial ILOG Solver.

== Overview ==
Minion was introduced in 2006 by researchers at the University of St Andrews as a “fast, scalable” solver for large and hard CSP instances. The project provides a compact input language and a low-overhead C++ implementation aimed at throughput and memory efficiency.

== Design and features ==
Minion implements a range of variable and constraint types commonly used in CSP modelling, plus search heuristics and optimisation support. The solver architecture prioritises cache-friendly data structures and specialised propagators. Notably, the developers adapted watched literal techniques from SAT solving to speed up constraint propagation for, among others, Boolean sums, the element global constraint, and table constraints. The modelling approach relies on a plain-text format (parsed by Minion) rather than embedding models into a host programming language. This reduces overhead and supports rapid “model-and-run” experimentation for large benchmark sets.

== Performance ==
In the original evaluation on standard benchmarks, the authors reported that Minion often ran between one and two orders of magnitude faster than state-of-the-art toolkits of the time (including ILOG Solver and Gecode) on large, hard instances, with smaller gains, or even slowdowns, on easier problems. Subsequent research has used Minion as a baseline solver in empirical studies and test generation tasks, reflecting its adoption within parts of the constraint programming community.
== Applications == Minion has been applied in academic work on combinatorial search, scheduling and test generation, and is available to other environments via wrappers (for example, from the R language).
Qualification problem
In philosophy and AI (especially, knowledge-based systems), the qualification problem is concerned with the impossibility of listing all the preconditions required for a real-world action to have its intended effect. It might be posed as how to deal with the things that prevent me from achieving my intended result. It is strongly connected to, and opposite the ramification side of, the frame problem. John McCarthy gives the following motivating example, in which it is impossible to enumerate all the circumstances that may prevent a robot from performing its ordinary function: [T]he successful use of a boat to cross a river requires, if the boat is a rowboat, that the oars and rowlocks be present and unbroken, and that they fit each other. Many other qualifications can be added, making the rules for using a rowboat almost impossible to apply, and yet anyone will still be able to think of additional requirements not yet stated.
Rademacher complexity
In computational learning theory (machine learning and theory of computation), Rademacher complexity, named after Hans Rademacher, measures the richness of a class of sets with respect to a probability distribution. The concept can also be extended to real-valued functions.

== Definitions ==
=== Rademacher complexity of a set ===
Given a set $A\subseteq \mathbb{R}^{m}$, the Rademacher complexity of $A$ is defined as follows:

$$\operatorname{Rad}(A):=\frac{1}{m}\mathbb{E}_{\sigma}\left[\sup_{a\in A}\sum_{i=1}^{m}\sigma_{i}a_{i}\right]$$

where $\sigma_{1},\sigma_{2},\dots,\sigma_{m}$ are independent random variables drawn from the Rademacher distribution, i.e. $\Pr(\sigma_{i}=+1)=\Pr(\sigma_{i}=-1)=1/2$ for $i\in\{1,2,\dots,m\}$, and $a=(a_{1},\ldots,a_{m})\in A$. Some authors take the absolute value of the sum before taking the supremum, but if $A$ is symmetric this makes no difference.

=== Rademacher complexity of a function class ===
Let $S=\{z_{1},z_{2},\dots,z_{m}\}\subseteq Z$ be a sample of points and consider a function class $\mathcal{F}$ of real-valued functions over $Z$.
Then, the empirical Rademacher complexity of $\mathcal{F}$ given $S$ is defined as:

$$\operatorname{Rad}_{S}(\mathcal{F})=\frac{1}{m}\mathbb{E}_{\sigma}\left[\sup_{f\in\mathcal{F}}\sum_{i=1}^{m}\sigma_{i}f(z_{i})\right]$$

This can also be written using the previous definition:

$$\operatorname{Rad}_{S}(\mathcal{F})=\operatorname{Rad}(\mathcal{F}\circ S)$$

where $\mathcal{F}\circ S$ denotes function composition, i.e.:

$$\mathcal{F}\circ S:=\{(f(z_{1}),\ldots,f(z_{m}))\mid f\in\mathcal{F}\}$$

The worst-case empirical Rademacher complexity is

$$\overline{\operatorname{Rad}}_{m}(\mathcal{F})=\sup_{S=\{z_{1},\dots,z_{m}\}}\operatorname{Rad}_{S}(\mathcal{F})$$

Let $P$ be a probability distribution over $Z$. The Rademacher complexity of the function class $\mathcal{F}$ with respect to $P$ for sample size $m$ is:

$$\operatorname{Rad}_{P,m}(\mathcal{F}):=\mathbb{E}_{S\sim P^{m}}\left[\operatorname{Rad}_{S}(\mathcal{F})\right]$$

where the above expectation is taken over an independent and identically distributed (i.i.d.) sample $S=(z_{1},z_{2},\dots,z_{m})$ generated according to $P$.

== Intuition ==
The Rademacher complexity is typically applied to a function class of models that are used for classification, with the goal of measuring their ability to classify points drawn from a probability space under arbitrary labellings.
When the function class is rich enough, it contains functions that can appropriately adapt for each arrangement of labels, simulated by the random draw of $\sigma_i$ under the expectation, so that this quantity in the sum is maximized.

The Rademacher complexity of a set $A$ can be rewritten as

$$\operatorname{Rad}(A) := \frac{1}{m} \mathbb{E}_\sigma \left[ \sup_{a \in A} \sum_{i=1}^m \sigma_i a_i \right] = \frac{1}{\sqrt{m}\, 2^m} \sum_{\sigma \in \{-1/\sqrt{m},\, +1/\sqrt{m}\}^m} \left[ \sup_{a \in A} \langle \sigma, a \rangle \right].$$

Each term in the summation is the farthest distance of the set $A$ from the origin, along a unit-length direction $\sigma$. The directions are along the vertices of a hypercube. Thus, we can also write it as

$$\operatorname{Rad}(A) = \frac{1}{2\sqrt{m}} \frac{1}{2^{m-1}} \sum_{\sigma \in \{-1/\sqrt{m},\, +1/\sqrt{m}\}^m / \{-1, +1\}} \left[ \sup_{a \in A} \langle \sigma, a \rangle - \inf_{a \in A} \langle \sigma, a \rangle \right]$$

Here, the set $\{-1/\sqrt{m}, +1/\sqrt{m}\}^m / \{-1, +1\}$ denotes half of the vertices of a hypercube, selected so that each diagonal has exactly one vertex selected. In words, this states that $2\sqrt{m} \operatorname{Rad}(A)$ is precisely the average width of the set $A$ along all diagonal directions of a hypercube.

== Examples ==

A singleton set has 0 width in any direction, so it has Rademacher complexity 0.
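For a finite set, the definition of the Rademacher complexity of a set can be evaluated exactly by enumerating all $2^m$ sign vectors. The following Python sketch does this under that assumption; the function name `rademacher_complexity` is a hypothetical helper introduced here for illustration, and the enumeration is only practical for small $m$.

```python
from itertools import product


def rademacher_complexity(points):
    """Exact Rad(A) for a finite, non-empty set A of m-dimensional points.

    Enumerates every sign vector sigma in {-1, +1}^m, takes the maximum of
    <sigma, a> over a in A, averages over sigma, and divides by m.
    """
    points = [tuple(p) for p in points]
    m = len(points[0])
    total = 0.0
    for sigma in product((-1, 1), repeat=m):
        # Supremum over the finite set A of the signed sum.
        total += max(sum(s * a for s, a in zip(sigma, p)) for p in points)
    # Expectation over 2^m equally likely sign vectors, then the 1/m factor.
    return total / (m * 2 ** m)


if __name__ == "__main__":
    print(rademacher_complexity([(3.0, -1.0, 2.0)]))  # singleton set
    print(rademacher_complexity([(1, 1), (1, 2)]))    # small two-point set
```

For the singleton the signed sums cancel over all sign vectors, so the result is 0, in agreement with the width argument above. Infinite sets such as convex bodies would need a different approach, for example maximizing over extreme points only.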
The set $A = \{(1,1), (1,2)\} \subseteq \mathbb{R}^2$ has average width $1/\sqrt{2}$ along the two diagonal directions of the square, so it has Rademacher complexity $1/4$. The unit cube $[0,1]^m$ has constant width $\sqrt{m}$ along the diagonal directions, so it has Rademacher complexity $1/2$. Similarly, the unit cross-polytope $\{x \in \mathbb{R}^m : \|x\|_1 \leq 1\}$ has constant width $2/\sqrt{m}$ along the diagonal directions, so it has Rademacher complexity $1/m$.

== Using the Rademacher complexity ==

The Rademacher complexity can be used to derive data-dependent upper bounds on the learnability of function classes. Intuitively, a function class with smaller Rademacher complexity is easier to learn.

=== Bounding the representativeness ===

In machine learning, it is desirable for a training sample $S$ to represent the true distribution of the data. This can be quantified using the notion of representativeness. Denote by $P$ the probability distribution from which the samples are drawn. Denote by $H$ the set of hypotheses (potential classifiers) and denote by $\mathcal{F}$ the corresponding set of error functions, i.e., for every hypothesis $h \in H$, there is a function $f_h \in \mathcal{F}$ that maps each training sample (features, label) to the error of the classifier $h$ (note that in this case hypothesis and classifier are used interchangeably). For example, when $h$ represents a binary classifier, the error function is a 0–1 loss function, i.e. the error function $f_h$ returns 0 if $h$ correctly classifies a sample and 1 otherwise.
We omit the index and write $f$ instead of $f_h$ when the underlying hypothesis is irrelevant. Define:

$L_P(f) := \mathbb{E}_{z \sim P}[f(z)]$, the expected error of some error function $f \in \mathcal{F}$ on the real distribution $P$;

$L_S(f) := \frac{1}{m} \sum_{i=1}^m f(z_i)$, the estimated error of some error function $f \in \mathcal{F}$ on the sample $S$.

The representativeness of the sample $S$, with respect to $P$ and $\mathcal{F}$, is defined as:

$$\operatorname{Rep}_P(\mathcal{F}, S) := \sup_{f \in \mathcal{F}} \left( L_P(f) - L_S(f) \right)$$

Smaller representativeness is better, since it provides a way to avoid overfitting: it means that the true error of a classifier is not much higher than its estimated error, and so selecting a classifier that has low estimated error will ensure that the true error is also low. Note however that the concept of representativeness is relative and hence cannot be compared across distinct samples.

The expected representativeness of a sample can be bounded above by the Rademacher complexity of the function class: if $\mathcal{F}$ is a set of functions with range within $[0,1]$, then

$$\operatorname{Rad}_{P,m}(\mathcal{F}) - \sqrt{\frac{\ln 2}{2m}} \leq \mathbb{E}_{S \sim P^m}\left[ \operatorname{Rep}_P(\mathcal{F}, S) \right] \leq 2 \operatorname{Rad}_{P,m}(\mathcal{F})$$
Microelectronics and Computer Technology Corporation
Microelectronics and Computer Technology Corporation, originally the Microelectronics and Computer Consortium and widely known by the acronym MCC, was the first, and at one time one of the largest, computer industry research and development consortia in the United States. MCC ceased operations in 2000 and was formally dissolved in 2004.

== Divisions ==

MCC did research and development in the following areas:[1]

System Architecture and Design (optimise hardware and software design, provide for scalability and interoperability, allow rapid prototyping for improved time-to-market, and support the re-engineering of existing systems for open systems).
Advanced Microelectronics Packaging and Interconnection (smaller, faster, more powerful, and cost-competitive).
Hardware Systems Engineering (tools and methodologies for cost-efficient, up-front design of advanced electronic systems, including modelling and design-for-test techniques to improve cost, yield, quality, and time-to-market).
Environmentally Conscious Technologies (process control and optimisation tools, information management and analysis capabilities, and non-hazardous material alternatives supporting cost-efficient production, waste minimisation, and reduced environmental impact).
Distributed Information Technology (managing and maintaining physically distributed corporate information resources on different platforms, building blocks for the national information infrastructure, networking tools and services for integration within and between companies, and electronic commerce).
Intelligent Systems (systems that "intelligently" support business processes and enhance performance, including decision support, data management, forecasting and prediction).

== History ==

The MCC was a response to the announcement of Japan's Fifth Generation Project, a large Japanese research project launched in 1982 aimed at producing a new kind of computer by 1991.
The Japanese had formed similar industrial research consortia as early as 1956.[2] Many European and American computer companies saw this new Japanese initiative as an attempt to take full control of the world's high-end computer market, and MCC was created, in part, as a defensive move against that threat. In late 1982, several major computer and semiconductor manufacturers in the United States banded together and founded MCC under the leadership of Admiral Bobby Ray Inman, whose previous positions had been Director of the National Security Agency and deputy director of the Central Intelligence Agency. Such formations were illegal in the United States until the 1984 Congressional passage of the "National Cooperative Research Act". Several sites with relevant universities were considered, including Atlanta, Georgia (Georgia Tech); the Research Triangle, N.C. (UNC); the Washington, D.C. area (George Mason); Stanford University; and Austin, Texas (UT), which was the final selection. The University of Texas offered land within its Austin campus, on which it would construct a new building specifically designed for the MCC. Ross Perot also offered the use of his private plane for two years for staff recruitment. Austin was selected as the site for MCC in 1983. Despite this defensive purpose and the background of Inman and his senior staff, MCC accepted no government funding for many years and was a refuge for some avoiding work on Strategic Defense Initiative projects. MCC was part of the Artificial Intelligence boom of the 1980s, reportedly the single largest customer of both Symbolics and Lisp Machines, Inc. (and, like Symbolics, was one of the first companies to register a .com domain). In the 1980s its major programs were packaging, software engineering, CAD, and advanced computer architectures. The last of these comprised artificial intelligence, human interface, database, and parallel processing, with database and parallel processing merging in the late 1980s.
Many of the early shareholder companies were mainframe computer companies under stress in the 1980s. Over the years, MCC's membership diversified to include a broad range of high-profile corporations involved in information technology products, as well as government research and development agencies and leading universities. In June 2000 the MCC Board of Directors voted to dissolve the consortium, and the few remaining employees held a wake at Scholz's Beer Garden in Austin on October 25. Formal dissolution papers were reportedly not filed until 2004.

== Spinoffs ==

While multiple technologies were transferred to member companies and government agencies in the final years, fourteen companies were spun out of MCC. Those spinoffs include:

TeraVicta Technologies, Austin's first MEMS company. Its focus was to develop microscopic switch technology for fiber optic switching and for radiofrequency switching in mobile phones, specifically to switch dynamically between 3G, 4G LTE, and future 5G wireless communication frequencies so that mobile phones communicate over the strongest wireless signal, reducing dropped calls. Robert Miracky was the founding CEO; he spun out the first commercial metal micromachining technology, developed by MCC researchers Brent Lunceford, Jason Reed, Richard Nelson, K. Hu, and C. Hilbert in a collaborative development program with IBM on a novel implementation and operational paradigm for solid-state integrated circuit coolers integrated with conductive MEMS switches. TeraVicta was liquidated under Chapter 7 bankruptcy proceedings in 2015. The Austin region subsequently built up a MEMS and sensors value chain worth billions of dollars, comprising companies such as 3M, Cypress Semiconductor, NXP Semiconductor, Cirrus Logic, Silicon Labs, and the Austin division of the now-defunct Silicon Valley Technology Center.
Portelligent, a company that provides reverse engineering teardown services.
At the time, Portelligent was the first company to commercialize such services; they had previously been provided by MCC to its member companies. Today, there are at least twelve companies worldwide that sell reports known as "reverse engineering teardown reports." Modern teardown reports provide detailed information about technology products, such as the bill of materials, microchip and printed circuit board design specifics, and manufacturing details, including manufacturing locations for the entire value chain responsible for making electronics such as the iPhone and Samsung Galaxy smartphones. Portelligent was acquired by CMP Technology in 2007.
Evolutionary Technologies International, a company focused on developing database tools and data warehousing. It was spun off from MCC in 1990.