AI Writing Tools

Explore the best AI Writing Tools — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • Hello World: How to be Human in the Age of the Machine

    Hello World: How to be Human in the Age of the Machine

    Hello World: How to Be Human in the Age of the Machine (also titled Hello World: Being Human in the Age of Algorithms) is a book on the growing influence of algorithms and artificial intelligence (AI) on human life, authored by mathematician and science communicator Hannah Fry. The book examines how algorithms are increasingly shaping decisions in critical areas such as healthcare, transportation, justice, finance, and the arts. == Overview == Fry uses real-world examples, such as driverless cars and predictive policing, to illustrate her points. She emphasizes that algorithms are not inherently objective; they reflect biases embedded in their design and data inputs. While acknowledging their potential to improve efficiency and accuracy, Fry cautions against over-reliance on machines without human judgment. Fry explores moral questions surrounding algorithmic decision-making, such as whether machines can replace human empathy in critical situations. She advocates for greater scrutiny of algorithms to ensure fairness and avoid harmful biases. The book proposes a "cyborg future", where humans work alongside algorithms to enhance decision-making while retaining ultimate control. == Reception == Hello World has been praised for its clarity, engaging storytelling, and balanced perspective. Critics have highlighted Fry's ability to make complex topics accessible to general audiences while raising important questions about technology's impact on society. The book was shortlisted for awards such as the 2018 Baillie Gifford Prize and the Royal Society Science Book Prize.

    Read more →
  • Dan Roth

    Dan Roth

    Dan Roth (Hebrew: דן רוט) is the Eduardo D. Glandt Distinguished Professor of Computer and Information Science at the University of Pennsylvania and the Chief AI Scientist at Oracle. Until June 2024 Roth was a VP and distinguished scientist at AWS AI. In his role at AWS, Roth led over the last three years the scientific effort behind the first-generation Generative AI products from AWS, including Titan Models, Amazon Q efforts, and Bedrock, from inception until they became generally available. Roth got his B.A. summa cum laude in mathematics from the Technion, Israel, and his Ph.D. in computer science from Harvard University in 1995. He taught at the University of Illinois at Urbana-Champaign from 1998 to 2017 before moving to the University of Pennsylvania. == Professional career == Roth is a Fellow of the American Association for the Advancement of Science (AAAS), the Association for Computing Machinery (ACM), the Association for the Advancement of Artificial Intelligence (AAAI), and the Association of Computational Linguistics (ACL). Roth’s research focuses on the computational foundations of intelligent behavior. He develops theories and systems pertaining to intelligent behavior using a unified methodology, at the heart of which is the idea that learning has a central role in intelligence. His work centers around the study of machine learning and inference methods to facilitate natural language understanding. In doing that he has pursued several interrelated lines of work that span multiple aspects of this problem - from fundamental questions in learning and inference and how they interact, to the study of a range of natural language processing (NLP) problems and developing advanced machine learning based tools for natural language applications. Roth has made seminal contribution to the fusion of Learning and Reasoning, Machine Learning with weak, incidental supervision, and to machine learning and inference approaches to natural language understanding. He has written the first paper on zero-shot learning in natural language processing, a 2008 paper by Chang, Ratinov, Roth, and Srikumar that was published at AAAI’08, but the name given to the learning paradigm there was dataless classification. Roth has worked on probabilistic reasoning (including its complexity and probabilistic lifted inference ), Constrained Conditional Models (ILP formulations of NLP problems) and constraints-driven learning, part-based (constellation) methods in object recognition, response based Learning, He has developed NLP and Information extraction tools that are being used broadly by researchers and commercially, including NER, coreference resolution, wikification, SRL, and ESL text correction. Roth is a co-founder of NexLP, Inc., a startup that applies natural language processing and machine learning in the legal and compliance domains. In 2020, NexLP was acquired by Reveal, Inc., an e-discovery software company. He is currently on the scientific advisory board of the Allen Institute for AI.

    Read more →
  • AI Resume Builders: Free vs Paid (2026)

    AI Resume Builders: Free vs Paid (2026)

    Comparing the best AI resume builder? An AI resume builder is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI resume builder slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Is an AI Video Editor Worth It in 2026?

    Is an AI Video Editor Worth It in 2026?

    Shopping for the best AI video editor? An AI video editor is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI video editor slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Text simplification

    Text simplification

    Text simplification is an aspect of natural language processing that involves modifying, organizing, or categorizing existing text to make it easier to understand while retaining its original meaning. This process is essential in today's world, where communication is increasingly complex due to advancements in science, technology, and media. Human languages are inherently intricate, with extensive vocabularies and complex structures that can be challenging for machines to handle efficiently. Researchers have found that semantic compression techniques can help streamline and simplify text by reducing linguistic diversity and simplifying the vocabulary used in a given context. == Example == Text simplification involves modifying complex sentences into simpler ones to enhance readability and comprehension. Siddharthan (2006) provides an example to illustrate this process. The original sentence contains multiple clauses and phrases, which can be broken down into simpler sentences for better understanding. Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold. Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today. An approach to text simplification involves lexical simplification via lexical substitution, a process that replaces complex words with simpler synonyms. Identifying complex words is a challenge addressed by machine learning classifiers trained on labeled data. Researchers have found that asking labelers to sort words by complexity levels yields more consistent results than the traditional method of categorizing words as simple or complex.

    Read more →
  • Oren Etzioni

    Oren Etzioni

    Oren Etzioni (born 1964) is Professor Emeritus of Computer Science at the University of Washington, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). Etzioni is a co-founder of Vercept, an AI startup, and founder and CEO of TrueMedia.org, a non-profit dedicated to fighting political deepfakes, which launched in April 2024. He is also the Founder and Technical Director of the AI2 Incubator and a venture partner at the Madrona Venture Group. == Early life and education == Etzioni is the son of Israeli-American intellectual Amitai Etzioni. He was the first student to major in computer science at Harvard University, where he earned a bachelor's degree in 1986. He earned a PhD from Carnegie Mellon University in January, 1991, supervised by Tom M. Mitchell. == University of Washington career == Etzioni joined the University of Washington faculty in 1991, immediately after receiving his PhD. He rose through the ranks to become the Washington Research Foundation Entrepreneurship Professor in Computer Science & Engineering. Etzioni's research has been focused on basic problems in the study of intelligence, machine reading, machine learning and web search. Past projects include Internet Softbots—the study of intelligent agents in the context of real-world software testbeds. In 2003, he started the KnowItAll project for acquiring massive amounts of information from the web. In 2005, he founded and became the director of the university's Turing Center. The center investigated problems in data mining, natural language processing, the Semantic Web and other web search topics. Etzioni coined the term machine reading and helped to create the first commercial comparison shopping agent. He has published over 200 technical papers, and his H-index exceeds 100. == Entrepreneurship == As a faculty member Etzioni was also an active entrepreneur, founding multiple companies and pioneering multiple technologies including MetaCrawler (bought by Infospace), Netbot (bought by Excite in 1997 for $35 million), and ClearForest (bought by Reuters). He founded Farecast, a travel metasearch and price prediction site, which was acquired by Microsoft in 2008 for $115 million. Before founding Farecast, he developed a program originally called Hamlet, that used algorithms to identify patterns in airfare data using data-mining techniques. He also co-founded Decide.com, a website to help consumers make buying decisions using previous price history and recommendations from other users. Decide.com was bought by eBay in September, 2013. Etzioni is also a venture partner at the Madrona Venture Group. He is founder and CEO of TrueMedia.org, a non-profit dedicated to fighting political deepfakes, which launched in April 2024. Etzioni is a co-founder of Vercept, an AI startup formed in 2025. == Founding CEO of AI2 == In September 2013 Etzioni was selected as the Founding CEO of the Allen Institute for Artificial Intelligence by philanthropist Paul G. Allen, and in January 2014 he took a leave of absence from the University of Washington to serve in that role. Etzioni's technical contributions continued at AI2; for example, in 2015, he helped to create the Semantic Scholar search engine. Under Etzioni’s leadership, AI2 grew from zero to over two hundred team members including notable researchers and engineers across several domains of AI. By 2021, its AI2 researchers had published near 700 papers in publications such as AAAI, ACL, CVPR, NeurIPS, and ICLR. Twenty-four of these papers had garnered special-recognition awards. AI2 also offered several key resources and tools to the AI community including the AllenNLP library, Semantic Scholar, and the conservation platforms EarthRanger and Skylight. Ed Lazowska, AI2 Board Member, has stated about Etzioni that he "took the collegial, collaborative culture that he absorbed in his 20+ years as a professor in UW's Allen School and mixed it with the singular focus that drives startups to create an elixir that AI2 folks have been drinking over the last eight years. The result is an exceptional organization of scientists, engineers, and entrepreneurs that's pursuing Paul Allen’s vision of ‘AI for the Common Good’ with extraordinary success.” == Popular press == In addition to his scientific publications, Etzioni has written commentary on AI for The New York Times, Wired, Nature, and other publications. After reading the idea in a book about AI by Brad Smith and Harry Shum, Etzioni has attempted to create an oath for AI practitioners. In 2018, he published what he called a "Hippocratic Oath for artificial intelligence practitioners" in TechCrunch. == Awards and recognition == In 1993, Etzioni received a National Young Investigator Award. In 2003, Etzioni was elected as AAAI Fellow. In 2005, Etzioni received an IJCAI Distinguished Paper Award for "A Probabilistic Model of Redundancy in Information Extraction". In 2007, he received the Robert S. Engelmore Memorial Award. In 2012 Etzioni was featured as GeekWire's "Geek of the Week". In 2013 Etzioni was voted "Geek of the Year" through GeekWire. In 2022, Etzioni received the 2012 ACL Test-of-Time Paper Award. In 2022, Etzioni, along with Ana-Maria Popescu and Henry Kautz, received the ACM Intelligent User Interfaces Most Impact Award for their 2003 paper, "Towards a Theory of Natural Language Interfaces to Databases". == Personal life == Etzioni has three children, and has said in interviews that family is his number one priority. He is married to Ivone Etzioni, and was previously married to Dr. Ruth Etzioni, a biostatistician at the Fred Hutchinson Cancer Center. Outside of his professional career, Etzioni has a wide range of personal interests. He has attended the Burning Man festival, which he described as a valuable way to step outside his comfort zone. His first computer was a TRS-80, and he has described his car’s GPS as his favorite gadget, joking that he has “no sense of direction.” == Selected publications == === Scholarly publications === Etzioni, Oren (July 1994). "A Softbot-based Interface to the Internet" (PDF). Communications of the ACM. Retrieved March 29, 2018. Etzioni, Oren (December 2008). "Open Information Extraction from the Web" (PDF). Communications of the ACM. Retrieved March 29, 2018. Zamir, Oren; Etzioni, Oren (1998). "Web document clustering". Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval. ACM. pp. 46–54. doi:10.1145/290941.290956. ISBN 978-1-58113-015-7. S2CID 244069. Zamir, Oren; Etzioni, Oren (May 1999). "Grouper: a dynamic clustering interface to Web search results". Computer Networks. 31 (11–16): 1361–1374. CiteSeerX 10.1.1.31.8216. doi:10.1016/S1389-1286(99)00054-7. S2CID 206134308. Popescu, Ana-Maria; Etzioni, Oren (2005). "Extracting product features and opinions from reviews". Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing - HLT '05. pp. 339–346. doi:10.3115/1220575.1220618. Etzioni, Oren; Cafarella, Michael; Downey, Doug; Popescu, Ana-Maria; Shaked, Tal; Sonderland, Stephen; Weld, Daniel; Yates, Alexander (June 2005). "Unsupervised named-entity extraction from the Web: An experimental study". Artificial Intelligence. 165 (1): 91–134. doi:10.1016/j.artint.2005.03.001. Downey, Doug; Etzioni, Oren; Sonderland, Stephen (July 2010). "Grouper: Analysis of a probabilistic model of redundancy in unsupervised information extraction". Artificial Intelligence. 174 (11): 726–748. CiteSeerX 10.1.1.174.2441. doi:10.1016/j.artint.2010.04.024. === Popular articles === Etzioni, Oren (August 4, 2011). "Web Search Needs a Shakeup" (PDF). Nature. Retrieved November 21, 2019. Etzioni, Oren (December 9, 2014). "AI Won't Exterminate Us – It Will Empower Us". Backchannel. Retrieved March 29, 2018. Etzioni, Oren (February 4, 2016). "To Keep AI Safe -- Use AI". Vox. Retrieved November 21, 2019. Etzioni, Oren (April 8, 2016). "Quora Session with Oren Etzioni". Quora. Retrieved March 29, 2018. Etzioni, Oren (June 15, 2016). "Deep Learning Isn't a Dangerous Magic Genie. It's Just Math". Wired. Retrieved March 29, 2018. Etzioni, Oren (September 20, 2016). "No, the Experts Don't Think Superintelligent AI is a Threat to Humanity". MIT Technology Review. Retrieved November 21, 2019. Etzioni, Oren (July 6, 2017). "Artificial intelligence: AI Zooms in on highly influential citations". Nature. Retrieved March 29, 2018. Etzioni, Oren (September 1, 2017). "How to Regulate Artificial Intelligence". The New York Times. Retrieved March 29, 2018. Etzioni, Oren (November 2, 2017). "Workers Displaced by Automation Should Try A New Job: Caregiver". Wired. Retrieved March 29, 2018. Etzioni, Oren (March 14, 2018). "A Hippocratic Oath for artificial intelligence practitioners". Tech Crunch. Retrieved March 29, 2018. Etzioni, Oren (March 7, 2018). "A 'Manhattan Project' for science research". The Hill. Retrieved November 21, 2019. Etzioni, Ore

    Read more →
  • Top 10 AI Resume Builders Compared (2026)

    Top 10 AI Resume Builders Compared (2026)

    In search of the best AI resume builder? An AI resume builder is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI resume builder slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • AI Text-to-image Tools: Free vs Paid (2026)

    AI Text-to-image Tools: Free vs Paid (2026)

    Shopping for the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • ViEWER

    ViEWER

    ViEWER, the Virtual Environment Workbench for Education and Research, is a proprietary, freeware computer program for Microsoft Windows written by researchers at the University of Idaho for the study of visual perception and complex immersive three-dimensional environments. It was created using C++ and OpenGL, and has been used by Dr. Brian Dyre, Dr. Steffen Werner, Dr. Ernesto Bustamante, Dr. Ben Barton, and their undergraduate and graduate researchers in visual perception, signal detection, and child-safety experiments.

    Read more →
  • Best AI Video Editors in 2026

    Best AI Video Editors in 2026

    Shopping for the best AI video editor? An AI video editor is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI video editor slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Top 10 AI Virtual Assistants Compared (2026)

    Top 10 AI Virtual Assistants Compared (2026)

    Looking for the best AI virtual assistant? An AI virtual assistant is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI virtual assistant slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • How to Choose an AI Voice Assistant

    How to Choose an AI Voice Assistant

    Curious about the best AI voice assistant? An AI voice assistant is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI voice assistant slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Local ternary patterns

    Local ternary patterns

    Local ternary patterns (LTP) are an extension of local binary patterns (LBP). Unlike LBP, it does not threshold the pixels into 0 and 1, rather it uses a threshold constant to threshold pixels into three values. Considering k as the threshold constant, c as the value of the center pixel, a neighboring pixel p, the result of threshold is: { 1 , if p > c + k 0 , if p > c − k and p < c + k − 1 if p < c − k {\displaystyle {\begin{cases}1,&{\text{if }}p>c+k\\0,&{\text{if }}p>c-k{\text{ and }}p Read more →

  • Wasserstein GAN

    Wasserstein GAN

    The Wasserstein Generative Adversarial Network (WGAN) is a variant of generative adversarial network (GAN) proposed in 2017 that aims to "improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches". Compared with the original GAN discriminator, the Wasserstein GAN discriminator provides a better learning signal to the generator. This allows the training to be more stable when generator is learning distributions in very high dimensional spaces. == Motivation == === The GAN game === The original GAN method is based on the GAN game, a zero-sum game with 2 players: generator and discriminator. The game is defined over a probability space ( Ω , B , μ r e f ) {\displaystyle (\Omega ,{\mathcal {B}},\mu _{ref})} , The generator's strategy set is the set of all probability measures μ G {\displaystyle \mu _{G}} on ( Ω , B ) {\displaystyle (\Omega ,{\mathcal {B}})} , and the discriminator's strategy set is the set of measurable functions D : Ω → [ 0 , 1 ] {\displaystyle D:\Omega \to [0,1]} . The objective of the game is L ( μ G , D ) := E x ∼ μ r e f [ ln ⁡ D ( x ) ] + E x ∼ μ G [ ln ⁡ ( 1 − D ( x ) ) ] . {\displaystyle L(\mu _{G},D):=\mathbb {E} _{x\sim \mu _{ref}}[\ln D(x)]+\mathbb {E} _{x\sim \mu _{G}}[\ln(1-D(x))].} The generator aims to minimize it, and the discriminator aims to maximize it. A basic theorem of the GAN game states that Repeat the GAN game many times, each time with the generator moving first, and the discriminator moving second. Each time the generator μ G {\displaystyle \mu _{G}} changes, the discriminator must adapt by approaching the ideal D ∗ ( x ) = d μ r e f d ( μ r e f + μ G ) . {\displaystyle D^{}(x)={\frac {d\mu _{ref}}{d(\mu _{ref}+\mu _{G})}}.} Since we are really interested in μ r e f {\displaystyle \mu _{ref}} , the discriminator function D {\displaystyle D} is by itself rather uninteresting. It merely keeps track of the likelihood ratio between the generator distribution and the reference distribution. At equilibrium, the discriminator is just outputting 1 2 {\displaystyle {\frac {1}{2}}} constantly, having given up trying to perceive any difference. Concretely, in the GAN game, let us fix a generator μ G {\displaystyle \mu _{G}} , and improve the discriminator step-by-step, with μ D , t {\displaystyle \mu _{D,t}} being the discriminator at step t {\displaystyle t} . Then we (ideally) have L ( μ G , μ D , 1 ) ≤ L ( μ G , μ D , 2 ) ≤ ⋯ ≤ max μ D L ( μ G , μ D ) = 2 D J S ( μ r e f ‖ μ G ) − 2 ln ⁡ 2 , {\displaystyle L(\mu _{G},\mu _{D,1})\leq L(\mu _{G},\mu _{D,2})\leq \cdots \leq \max _{\mu _{D}}L(\mu _{G},\mu _{D})=2D_{JS}(\mu _{ref}\|\mu _{G})-2\ln 2,} so we see that the discriminator is actually lower-bounding D J S ( μ r e f ‖ μ G ) {\displaystyle D_{JS}(\mu _{ref}\|\mu _{G})} . === Wasserstein distance === Thus, we see that the point of the discriminator is mainly as a critic to provide feedback for the generator, about "how far it is from perfection", where "far" is defined as Jensen–Shannon divergence. Naturally, this brings the possibility of using a different criteria of farness. There are many possible divergences to choose from, such as the f-divergence family, which would give the f-GAN. The Wasserstein GAN is obtained by using the Wasserstein metric, which satisfies a "dual representation theorem" that renders it highly efficient to compute: A proof can be found in the main page on Wasserstein metric. == Definition == By the Kantorovich-Rubenstein duality, the definition of Wasserstein GAN is clear:A Wasserstein GAN game is defined by a probability space ( Ω , B , μ r e f ) {\displaystyle (\Omega ,{\mathcal {B}},\mu _{ref})} , where Ω {\displaystyle \Omega } is a metric space, and a constant K > 0 {\displaystyle K>0} . There are 2 players: generator and discriminator (also called "critic"). The generator's strategy set is the set of all probability measures μ G {\displaystyle \mu _{G}} on ( Ω , B ) {\displaystyle (\Omega ,{\mathcal {B}})} . The discriminator's strategy set is the set of measurable functions of type D : Ω → R {\displaystyle D:\Omega \to \mathbb {R} } with bounded Lipschitz-norm: ‖ D ‖ L ≤ K {\displaystyle \|D\|_{L}\leq K} . The Wasserstein GAN game is a zero-sum game, with objective function L W G A N ( μ G , D ) := E x ∼ μ G [ D ( x ) ] − E x ∼ μ r e f [ D ( x ) ] . {\displaystyle L_{WGAN}(\mu _{G},D):=\mathbb {E} _{x\sim \mu _{G}}[D(x)]-\mathbb {E} _{x\sim \mu _{ref}}[D(x)].} The generator goes first, and the discriminator goes second. The generator aims to minimize the objective, and the discriminator aims to maximize the objective: min μ G max D L W G A N ( μ G , D ) . {\displaystyle \min _{\mu _{G}}\max _{D}L_{WGAN}(\mu _{G},D).} By the Kantorovich-Rubenstein duality, for any generator strategy μ G {\displaystyle \mu _{G}} , the optimal reply by the discriminator is D ∗ {\displaystyle D^{}} , such that L W G A N ( μ G , D ∗ ) = K ⋅ W 1 ( μ G , μ r e f ) . {\displaystyle L_{WGAN}(\mu _{G},D^{})=K\cdot W_{1}(\mu _{G},\mu _{ref}).} Consequently, if the discriminator is good, the generator would be constantly pushed to minimize W 1 ( μ G , μ r e f ) {\displaystyle W_{1}(\mu _{G},\mu _{ref})} , and the optimal strategy for the generator is just μ G = μ r e f {\displaystyle \mu _{G}=\mu _{ref}} , as it should. == Comparison with GAN == In the Wasserstein GAN game, the discriminator provides a better gradient than in the GAN game. Consider for example a game on the real line where both μ G {\displaystyle \mu _{G}} and μ r e f {\displaystyle \mu _{ref}} are Gaussian. Then the optimal Wasserstein critic D W G A N {\displaystyle D_{WGAN}} and the optimal GAN discriminator D {\displaystyle D} are plotted as below: For fixed discriminator, the generator needs to minimize the following objectives: For GAN, E x ∼ μ G [ ln ⁡ ( 1 − D ( x ) ) ] {\displaystyle \mathbb {E} _{x\sim \mu _{G}}[\ln(1-D(x))]} . For Wasserstein GAN, E x ∼ μ G [ D W G A N ( x ) ] {\displaystyle \mathbb {E} _{x\sim \mu _{G}}[D_{WGAN}(x)]} . Let μ G {\displaystyle \mu _{G}} be parametrized by θ {\displaystyle \theta } , then we can perform stochastic gradient descent by using two unbiased estimators of the gradient: ∇ θ E x ∼ μ G [ ln ⁡ ( 1 − D ( x ) ) ] = E x ∼ μ G [ ln ⁡ ( 1 − D ( x ) ) ⋅ ∇ θ ln ⁡ ρ μ G ( x ) ] {\displaystyle \nabla _{\theta }\mathbb {E} _{x\sim \mu _{G}}[\ln(1-D(x))]=\mathbb {E} _{x\sim \mu _{G}}[\ln(1-D(x))\cdot \nabla _{\theta }\ln \rho _{\mu _{G}}(x)]} ∇ θ E x ∼ μ G [ D W G A N ( x ) ] = E x ∼ μ G [ D W G A N ( x ) ⋅ ∇ θ ln ⁡ ρ μ G ( x ) ] {\displaystyle \nabla _{\theta }\mathbb {E} _{x\sim \mu _{G}}[D_{WGAN}(x)]=\mathbb {E} _{x\sim \mu _{G}}[D_{WGAN}(x)\cdot \nabla _{\theta }\ln \rho _{\mu _{G}}(x)]} where we used the reparameterization trick. As shown, the generator in GAN is motivated to let its μ G {\displaystyle \mu _{G}} "slide down the peak" of ln ⁡ ( 1 − D ( x ) ) {\displaystyle \ln(1-D(x))} . Similarly for the generator in Wasserstein GAN. For Wasserstein GAN, D W G A N {\displaystyle D_{WGAN}} has gradient 1 almost everywhere, while for GAN, ln ⁡ ( 1 − D ) {\displaystyle \ln(1-D)} has flat gradient in the middle, and steep gradient elsewhere. As a result, the variance for the estimator in GAN is usually much larger than that in Wasserstein GAN. See also Figure 3 of. The problem with D J S {\displaystyle D_{JS}} is much more severe in actual machine learning situations. Consider training a GAN to generate ImageNet, a collection of photos of size 256-by-256. The space of all such photos is R 256 2 {\displaystyle \mathbb {R} ^{256^{2}}} , and the distribution of ImageNet pictures, μ r e f {\displaystyle \mu _{ref}} , concentrates on a manifold of much lower dimension in it. Consequently, any generator strategy μ G {\displaystyle \mu _{G}} would almost surely be entirely disjoint from μ r e f {\displaystyle \mu _{ref}} , making D J S ( μ G ‖ μ r e f ) = + ∞ {\displaystyle D_{JS}(\mu _{G}\|\mu _{ref})=+\infty } . Thus, a good discriminator can almost perfectly distinguish μ r e f {\displaystyle \mu _{ref}} from μ G {\displaystyle \mu _{G}} , as well as any μ G ′ {\displaystyle \mu _{G}'} close to μ G {\displaystyle \mu _{G}} . Thus, the gradient ∇ μ G L ( μ G , D ) ≈ 0 {\displaystyle \nabla _{\mu _{G}}L(\mu _{G},D)\approx 0} , creating no learning signal for the generator. Detailed theorems can be found in. == Training Wasserstein GANs == Training the generator in Wasserstein GAN is just gradient descent, the same as in GAN (or most deep learning methods), but training the discriminator is different, as the discriminator is now restricted to have bounded Lipschitz norm. There are several methods for this. === Upper-bounding the Lipschitz norm === Let the discriminator function D {\displaystyle D} to be implemented by a multilayer perceptron: D = D n ∘ D n − 1 ∘ ⋯ ∘ D 1 {\displaystyle D=D_{n}\circ D_{n-1}\circ \cdots \circ D_{1}} where D i ( x ) = h ( W i x ) {\displaystyle D_{i}(x)=h(W_

    Read more →
  • Klaus-Robert Müller

    Klaus-Robert Müller

    Klaus-Robert Müller (born 1964 in Karlsruhe, West Germany) is a German computer scientist and physicist, most noted for his work in machine learning and brain–computer interfaces. == Career == Klaus-Robert Müller received his Diplom in mathematical physics and PhD in theoretical computer science from the University of Karlsruhe. Following his Ph.D. he went to Berlin as a postdoctoral fellow at GMD (German National Research Center for Computer Science) Berlin (now part of Fraunhofer Institute for Open Communication Systems), where he started building up the Intelligent Data Analysis (IDA) group. From 1994 to 1995 he was a research fellow at Shun'ichi Amari's lab at the University of Tokyo. 1999 Müller became an associate professor for neuroinformatics at the University of Potsdam, transitioning to the full professorship for Neural Networks and Time Series Analysis in 2003. Since 2006 he holds the chair for Machine Learning at Technische Universität Berlin. Since 2012 he holds a distinguished professorship at Korea University in Seoul. He co-founded and is co-director of the Berlin Big Data Center (BBDC) of TU Berlin. As of 2017, 29 former doctoral or postdoctoral researchers of Klaus-Robert Müller have become full professors themselves. Bernhard Schölkopf and Alexander J. Smola were supervised by him as members of his research group. Since 2020 he is director of the Berlin Institute for the Foundations of Learning and Data (BIFOLD), a German National AI Competence Center, and director of the European Laboratory for Learning and Intelligent Systems (ELLIS) unit Berlin. In 2020/2021 he spent his sabbatical at Google Brain as a principal scientist. == Research == Müller has contributed extensively to several major interests of machine learning, including support vector machines (SVMs) and kernel methods, and artificial neural networks. He pioneered applying new methods of pattern recognition in domains like brain–computer interfaces, using them for patients with Locked-in syndrome. He is one of the leading computer scientists affiliated with Germany. His current research interests include: Statistical learning theory (Support Vector Machines, Deep Neural Networks, Boosting) Learning of non-stationarity data Fusion of structured heterogeneous multi-modal data, co-adaptation Applications: MEG, EEG, NIRS, ECoG, EMG, Brain Computer Interfaces, computational neuroscience, computer vision, genomic data analysis, computational chemistry and atomistic simulations, digital pathology == Honours and awards == Klaus-Robert Müller was elected a fellow of the German National Academy of Sciences Leopoldina in 2012. In 2017 he was elected member of the Berlin-Brandenburg Academy of Sciences and Humanities and also external scientific member of the Max Planck Society. In 2021 he was elected member of the German Academy of Science and Engineering. His work was honoured with several awards, including: 2026 Gottfried Wilhelm Leibniz Prize 2025 IEEE Neural Network Pioneer Award 2024 Feynman Prize in Nanotechnology 2023 Hector Fellow 2025, 2024, 2023, 2022, 2021, 2020, and 2019 Clarivate Highly Cited Researcher 2017 Vodafone Innovations Award 2017 2014 Science Prize of Berlin 2014 by the Governing Mayor of Berlin 2014 European Research Council Panel Consolidator Grants 2009 Best Paper award by IEEE Engineering in Medicine and Biology Society EMBS 2006 SEL-ALCATEL Research Prize for Technical Communication 1999 Olympus Award for Pattern Recognition == Books == with Holzinger, Andreas; et al., eds. (2022). xxAI – Beyond Explainable Artificial Intelligence. Lecture Notes in Computer Science. Vol. 13200. Springer Cham. doi:10.1007/978-3-031-04083-2. ISBN 978-3-031-04082-5. with Schütt, Kristof T.; et al., eds. (2020). Machine Learning Meets Quantum Physics. Lecture Notes in Physics. Vol. 968. Springer Cham. doi:10.1007/978-3-030-40245-7. ISBN 978-3-030-40244-0. S2CID 242406994. with Samek, Wojciech; et al., eds. (2019). Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Lecture Notes in Computer Science. Vol. 11700. Springer Cham. doi:10.1007/978-3-030-28954-6. ISBN 978-3-030-28953-9. with Montavon, Grégoire; et al., eds. (2012). Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science. Vol. 7700 (2nd ed.). Springer Berlin, Heidelberg. doi:10.1007/978-3-642-35289-8. ISBN 978-3-642-35288-1. S2CID 39578794.

    Read more →