What Is AI Coding? (AI Coding Kya Hota Hai)

AI Coding Kya Hota Hai: independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Knowledge graph embedding

    In representation learning, knowledge graph embedding (KGE), also called knowledge representation learning (KRL) or multi-relation learning, is a machine learning task of learning a low-dimensional representation of a knowledge graph's entities and relations while preserving their semantic meaning. Leveraging their embedded representation, knowledge graphs can be used for various applications such as link prediction, triple classification, entity recognition, clustering, and relation extraction.

    == Definition ==

    A knowledge graph $\mathcal{G} = \{E, R, F\}$ is a collection of entities $E$, relations $R$, and facts $F$. A fact is a triple $(h, r, t) \in F$ that denotes a link $r \in R$ between the head $h \in E$ and the tail $t \in E$ of the triple. Another notation that is often used in the literature to represent a triple (or fact) is $\langle \text{head}, \text{relation}, \text{tail} \rangle$. This notation is called the Resource Description Framework (RDF). A knowledge graph represents the knowledge related to a specific domain; leveraging this structured representation, it is possible to infer new knowledge from it after some refinement steps. In practice, however, data sparsity and computational inefficiency make knowledge graphs hard to use directly in real-world applications.

    The embedding of a knowledge graph is a function that translates each entity and each relation into a vector of a given dimension $d$, called the embedding dimension. It is even possible to embed the entities and relations with different dimensions. The embedding vectors can then be used for other tasks.
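    As a rough illustration of the definition above, the sketch below stores a tiny knowledge graph as (head, relation, tail) triples and maps every entity and relation to a vector of dimension d. The example facts, the `build_embeddings` helper, and the choice d = 4 are hypothetical and are not taken from the source or from any particular library; the vectors are simply random, as they would be before any training.

```python
import random

# Hypothetical toy facts (head, relation, tail); purely illustrative.
FACTS = [
    ("Budapest", "capital_of", "Hungary"),
    ("Edinburgh", "located_in", "Scotland"),
    ("Hungary", "member_of", "EU"),
]


def build_embeddings(facts, d, seed=0):
    """Map each entity and each relation to a random d-dimensional vector."""
    rng = random.Random(seed)
    # E: every symbol that appears as a head or a tail.
    entities = sorted({x for h, _, t in facts for x in (h, t)})
    # R: every symbol that appears as a relation.
    relations = sorted({r for _, r, _ in facts})

    def random_vector():
        return [rng.uniform(-1.0, 1.0) for _ in range(d)]

    entity_emb = {e: random_vector() for e in entities}
    relation_emb = {r: random_vector() for r in relations}
    return entity_emb, relation_emb


entity_emb, relation_emb = build_embeddings(FACTS, d=4)
```

    Keeping two separate lookup tables also makes it easy to give entities and relations different dimensions, which the definition explicitly allows.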
    A knowledge graph embedding is characterized by four aspects:

    Representation space: The low-dimensional space in which the entities and relations are represented.
    Scoring function: A measure of the goodness of a triple's embedded representation.
    Encoding models: The modality in which the embedded representations of the entities and relations interact with each other.
    Additional information: Any additional information coming from the knowledge graph that can enrich the embedded representation. Usually, an ad hoc scoring function is integrated into the general scoring function for each additional piece of information.

    == Embedding procedure ==

    All algorithms for creating a knowledge graph embedding follow the same approach. First, the embedding vectors are initialized to random values. Then, they are iteratively optimized using a training set of triples. In each iteration, a batch of $b$ triples is sampled from the training set, and for each triple in the batch a corrupted triple is sampled, i.e., a triple that does not represent a true fact in the knowledge graph. Corrupting a triple involves substituting the head or the tail (or both) of the triple with another entity that makes the fact false. The original triple and the corrupted triple are added to the training batch, and then the embeddings are updated by optimizing a scoring function. Iteration stops when a stop condition is reached. Usually, the stop condition is based on overfitting to the training set. At the end, the learned embeddings should have extracted semantic meaning from the training triples and should correctly predict unseen true facts in the knowledge graph.

    === Pseudocode ===

    The following is the pseudocode for the general embedding procedure.
    algorithm Compute entity and relation embeddings
        input: the training set $S = \{(h, r, t)\}$, entity set $E$, relation set $R$, embedding dimension $k$
        output: entity and relation embeddings
        initialization: the entity embeddings $e$ and relation embeddings $r$ (vectors) are randomly initialized
        while the stop condition is not met do
            $S_{batch} \leftarrow sample(S, b)$  // Sample a batch from the training set
            $T_{batch} \leftarrow \emptyset$  // Start from an empty set of triple pairs
            for each $(h, r, t)$ in $S_{batch}$ do
                $(h', r, t') \leftarrow sample(S')$  // Sample a corrupted fact
                $T_{batch} \leftarrow T_{batch} \cup \{((h, r, t), (h', r, t'))\}$
            end for
            Update embeddings by minimizing the loss function
        end while

    == Performance indicators ==

    These indexes are often used to measure the embedding quality of a model. Their simplicity makes them well suited to evaluating the performance of an embedding algorithm even on a large scale. Given $Q$ as the set of all ranked predictions of a model, it is possible to define three different performance indexes: Hits@K, MR, and MRR.

    === Hits@K ===

    Hits@K, or H@K for short, is a performance index that measures the probability of finding the correct prediction among the model's top K predictions. Typically, $k = 10$ is used. Hits@K reflects the accuracy of an embedding model in predicting the relation between two given triples correctly.

    $\text{Hits@K} = \frac{|\{q \in Q : q < k\}|}{|Q|} \in [0, 1]$
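    The three indexes named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: ranks are taken to be 1-based, so "within the top K" is tested as rank <= k (the strict q < k in the formula would match 0-based ranks); MR and MRR are named but not defined in this excerpt, so their usual definitions (mean rank and mean reciprocal rank) are assumed. The function names and the sample ranks are hypothetical.

```python
def hits_at_k(ranks, k=10):
    """Fraction of queries whose correct answer is ranked within the top k (1-based ranks)."""
    return sum(1 for q in ranks if q <= k) / len(ranks)


def mean_rank(ranks):
    """MR: arithmetic mean of the ranks (lower is better). Assumed definition."""
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """MRR: mean of 1/rank (higher is better, at most 1). Assumed definition."""
    return sum(1.0 / q for q in ranks) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
ranks = [1, 3, 12, 2, 50]

h10 = hits_at_k(ranks, k=10)
mr = mean_rank(ranks)
mrr = mean_reciprocal_rank(ranks)
```

    In this sample the single rank of 50 dominates MR while barely moving MRR, which is one reason the three indexes are usually reported together.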

  • How to Choose an AI Background Remover

    Shopping for the best AI background remover? An AI background remover is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

  • Baidu Fanyi

    Baidu Fanyi is a service provided by Baidu for translating text paragraphs and web pages. In 2015, Baidu Translation won the second prize of China's National Science and Technology Progress Award. == Supported languages == Baidu Fanyi supports some languages that are missing from Google Translate, such as Cornish, although some of these translations are of poor quality. As of June 2026, translation is available in 201 languages.

  • András Kornai

    András Kornai (born 1957 in Budapest) is a mathematical linguist.

    == Education ==

    Kornai is the son of economist János Kornai. He earned two PhDs: the first in mathematics in 1983 from Eötvös Loránd University in Budapest, where his advisor was Miklós Ajtai, and the second in linguistics in 1991 from Stanford University, where his advisor was Paul Kiparsky.

    == Career ==

    He is a professor in the Department of Algebra at the Budapest Institute of Technology, where he works on an open source Hungarian morphological analyzer. He was Chief Scientist at MetaCarta, where he worked on information extraction before the company was acquired by Nokia. Prior to MetaCarta, he was Chief Scientist at Northern Light. He is on the board of the journal Grammars and YourAmigo PLC. His research interests include all mathematical aspects of natural language processing, speech recognition, and OCR. As area editor he was responsible for the Mathematical Linguistics area of the Oxford International Encyclopedia of Linguistics, and his joint work with Geoffrey Pullum, "The X-bar Theory of Phrase Structure", formally reconstructed that then-popular linguistic theory.

    == Awards and honors ==

    2009: ACM Distinguished Member

    == Monographs ==

    Semantics. Springer Nature, 2020. ISBN 978-3-319-65644-1.
    Mathematical Linguistics. Springer Verlag, in the series Advanced Information and Knowledge Processing, November 2007. ISBN 978-1-84628-985-9. Hardbound, approximately 300 pages.
    Formal Phonology. In the series Outstanding Dissertations in Linguistics, Garland Publishing, 1994. ISBN 0-8153-1730-1. Hardbound, 240 pages.
    On Hungarian Morphology. In the series Linguistica, Hungarian Academy of Sciences, 1994. ISBN 963-8461-73-X. Paperbound, 174 pages.

    == Books edited ==

    Oxford International Encyclopedia of Linguistics (Mathematical Linguistics Area Editor under Editor in Chief William Frawley). 4 volumes, Oxford University Press, 2003. ISBN 978-0-19-513977-8.
    Proceedings of the HLT-NAACL Workshop on the Analysis of Geographic References. Jointly with Beth Sundheim. Association for Computational Linguistics, 2003. ISBN 1-932432-04-3 (WS9). Paperbound, vi+81 pages.
    Extended Finite State Models of Language (editor). In the series Studies in Natural Language Processing, Cambridge University Press, 1999. ISBN 0-521-63198-X. Hardbound, x+278 pages.

    == Selected papers ==

    Digital Language Death. PLoS ONE 8(10): e77056, 2012.
    Hunmorph: open source word analysis (Jointly with V. Tron, Gy. Gyepesi, P. Halacsy, L. Nemeth, and D. Varga). In Proc. ACL 2005 Software Workshop, 77-85.
    Leveraging the open source ispell codebase for minority language analysis (Jointly with P. Halacsy, L. Nemeth, A. Rung, I. Szakadat, and V. Tron). In J. Carson-Berndsen (ed): Proc. SALTMIL 2004, 56-59.
    Explicit Finitism. International Journal of Theoretical Physics 2003/2, 301-307.
    Mathematical Linguistics (Jointly with G.K. Pullum). In W. Frawley (ed): Oxford International Encyclopedia of Linguistics, Oxford University Press 2003, v3, 17-20.
    Optical Character Recognition. In W. Frawley (ed): Oxford International Encyclopedia of Linguistics, Oxford University Press 2003, v3, 33-34.
    How many words are there? Glottometrics 2002/4, 61-86.
    Zipf's law outside the middle range. Proc. Sixth Meeting on Mathematics of Language, University of Central Florida, 1999, 347-356.
    A Robust, Language-Independent OCR System (Jointly with Z. Lu, I. Bazzi, J. Makhoul, P. Natarajan, and R. Schwartz). In Robert J. Mericsko (ed): Proc. 27th AIPR Workshop: Advances in Computer-Assisted Recognition, SPIE Proceedings 3584, 1999.
    Quantitative Comparison of Languages. Grammars 1998/2, 155-165.
    The generative power of feature geometry. Annals of Mathematics and Artificial Intelligence 8, 1993, 37-46.
    The X-bar Theory of Phrase Structure (Jointly with G.K. Pullum). Language 66, 1990, 24-50.

  • Secure state

    In information systems security, a secure state is a condition of a computer system whose entities are divided into subjects and objects, and for which it can be formally proven that each state transition preserves security by moving from one secure state to another secure state. It can thereby be proven inductively that the system is secure. As defined in the Bell–LaPadula model, the secure state is built on the concept of a state machine with a set of allowable states in a system. The transition from one state to another is defined by transition functions. A system state is defined to be "secure" if the only permitted access modes of subjects to objects are in accordance with a security policy.
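    The state-machine idea can be sketched as follows. The text above does not spell out the security policy, so this example assumes the two mandatory rules usually associated with Bell–LaPadula: a subject may read only objects at or below its level ("no read up") and write only objects at or above its level ("no write down"). The level names, the `Access` record, and the `transition` helper are hypothetical and greatly simplified.

```python
from dataclasses import dataclass

# Hypothetical, totally ordered security levels.
LEVELS = {"unclassified": 0, "secret": 1, "top_secret": 2}


@dataclass(frozen=True)
class Access:
    subject: str
    obj: str
    mode: str  # "read" or "write"


def access_allowed(access, subject_level, object_level):
    """Check one access against the assumed policy."""
    s = LEVELS[subject_level[access.subject]]
    o = LEVELS[object_level[access.obj]]
    if access.mode == "read":
        return s >= o  # no read up
    if access.mode == "write":
        return s <= o  # no write down
    return False


def is_secure(state, subject_level, object_level):
    """A state (a set of current accesses) is secure if every access obeys the policy."""
    return all(access_allowed(a, subject_level, object_level) for a in state)


def transition(state, request, subject_level, object_level):
    """Grant the request only if the resulting state is still secure."""
    candidate = state | {request}
    return candidate if is_secure(candidate, subject_level, object_level) else state


subject_level = {"alice": "secret"}
object_level = {"memo": "unclassified", "plan": "top_secret"}

s0 = frozenset()  # the empty state is trivially secure
s1 = transition(s0, Access("alice", "memo", "read"), subject_level, object_level)
s2 = transition(s1, Access("alice", "plan", "read"), subject_level, object_level)
s3 = transition(s2, Access("alice", "plan", "write"), subject_level, object_level)
```

    Because `transition` never returns an insecure state and the initial state is secure, every reachable state is secure, which mirrors the inductive argument described above.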

  • The Best Free AI Text-to-image Tool for Beginners

    Looking for the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

  • SNNS

    SNNS (Stuttgart Neural Network Simulator) is a neural network simulator originally developed at the University of Stuttgart. While it was originally built for X11 under Unix, there are Windows ports. Its successor JavaNNS never reached the same popularity. == Features == SNNS is written around a simulation kernel to which user-written activation functions, learning procedures and output functions can be added. It supports arbitrary network topologies, and the standard release includes a number of standard neural network architectures and training algorithms. == Status == SNNS is no longer under active development. In July 2008 the license was changed to the GNU LGPL.

  • AI Customer-support Bots Reviews: What Actually Works in 2026

    Looking for the best AI customer-support bot? An AI customer-support bot is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI customer-support bot slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

  • ImageMixer

    ImageMixer is a brand of video editing software for editing digital video and still images from camcorders and authoring them to VCD and DVD. It is a second-party Japanese product, distributed by Pixela Corporation, a Japanese manufacturer of PC peripheral hardware and multimedia software. == Bundling == ImageMixer is widely bundled with several camcorder brands, such as JVC, Hitachi and Canon. Sony has also chosen to package ImageMixer with its DVD and HDD Handycam. == ImageMixer series == The ImageMixer series includes other software for digital cameras, such as ImageMixer Label Maker and ImageMixer DVD dubbing, as well as a movie editing solution for Macintosh. == Windows Vista version of ImageMixer == A Windows Vista version of ImageMixer has been developed (ImageMixer3).

  • Mirella Lapata

    Mirella Lapata is a computer scientist and Professor in the School of Informatics at the University of Edinburgh. Working on the general problem of extracting semantic information from large bodies of text, Lapata develops computer algorithms and models in the field of natural language processing (NLP). == Education == Lapata obtained a Master of Arts (MA) degree from Carnegie Mellon University and subsequently earned a doctorate from the University of Edinburgh. Lapata's doctoral research investigated the acquisition of information from polysemous linguistic units using probabilistic methods supervised by Alex Lascarides, Chris Brew and Steve Finch. == Career and research == After her doctorate, Lapata assumed academic positions at Saarland University and at the Department of Computer Science at the University of Sheffield. At the University of Edinburgh she became a reader in the School of Informatics where she is a full Professor and holds a personal chair in natural language processing. Lapata is a member of the Human Communication Research Center and Institute for Language, Cognition and Computation, both in Edinburgh. Between 2015 and 2017, Lapata served as a member of the Royal Society Machine Learning Working Group. Recently Lapata was granted a European Research Council (ERC) Consolidator Grant worth €1.9M to fund five years of her project, TransModal: Translating from Multiple Modalities into Text. === Awards and honours === In 2009 Lapata became the first recipient of the Microsoft British Computer Society (BCS)/BCS IRSG Karen Spärck Jones Award. The award recognises achievement in furthering the progress in information retrieval and natural language processing; the award commemorates the life and work of Karen Spärck Jones. In 2012 Lapata won an Empirical Methods in Natural Language Processing (EMNLP)-CoNLL 2012 Best Reviewer Award. 
In 2018 Lapata was awarded, alongside Li Dong, an Association for Computational Linguistics (ACL) Best Paper Honorable Mention. In 2019 Lapata was elected a Fellow of the Royal Society of Edinburgh. In 2020 Lapata was elected to the Academia Europaea. In 2025 Lapata was awarded the BCS Lovelace Medal for Computing Research.

  • METAL MT

    METAL MT is a machine translation system developed at the University of Texas and at Siemens which ran on Lisp Machines. == Background == Originally titled the Linguistics Research System (LRS), it was later renamed METAL (Mechanical Translation and Analysis of Languages). It started life as a German-English system funded by the USAF. == 1980 == In September 1980, the German Government requested a copy of the Weidner Multi-Lingual Word Processing software for the Siemens Corporation of Germany; it was nicknamed the Siemens-Weidner Engine (originally English-German). According to John White of the Siemens Corporation, this revolutionary multilingual word processing engine became foundational in the development of the METAL MT project. After METAL MT, development rights to the Siemens-Weidner Engine were sold to a Belgian company, Lernout & Hauspie. The Siemens copy of the Weidner Multilingual Word Processing software has since been acquired through the purchase of assets of Lernout & Hauspie by Bowne Global Solutions, Inc., which was later acquired by Lionbridge Technologies, Inc., and is demonstrated in their itranslator software.

  • Nick Frosst

    Nicholas M. W. Frosst is a Canadian computer scientist and musician. He co-founded Cohere, a Toronto-based artificial intelligence company. He is also the lead singer in the indie rock band Good Kid. == Early life and education == Frosst was born on January 5, 1993. He earned a Bachelor of Science degree in computer science and cognitive science from the University of Toronto in 2015. He was a student of Geoffrey Hinton, who also hired Frosst at Google Brain. == Career == Frosst was among Geoffrey Hinton's earliest hires at Google Brain in Toronto, working as a machine learning researcher on deep learning and neural network architectures. He worked there from 2016 to 2020. Frosst co-founded Cohere with Aidan Gomez and Ivan Zhang in 2019. The company builds large language models and enterprise AI tools. Frosst has publicly explained Cohere's focus on industries like finance and health, where there are privacy and other regulatory considerations. Frosst has also spoken openly about his belief that artificial intelligence will not replace humans, but rather streamline and automate mundane tasks, and his belief that AGI is less "imminent" than many in the field claim. Frosst and the other Cohere co-founders were listed first on Maclean's AI Trailblazers Power List and The Logic's Innovation Leaders. == Music == After spending time in a prior band which played "weird" music featuring a glockenspiel, Frosst and fellow computer science students at the University of Toronto formed the indie rock band Good Kid in 2015. Frosst is the lead vocalist for the band. While on tour with the band, Frosst continues his work in the tech industry remotely. Frosst has described the band as a way for him to relax and not constantly think about tech. His vocals have been compared to those of Kele Okereke. As of 2026, the band, which has performed at Lollapalooza, has 3.1 million monthly Spotify listeners.
In 2024, the band was nominated for the Juno Award for Breakthrough Group of the Year. == Discography == === Good Kid === Can We Hang Out Sometime? (2026)

  • Micro stuttering

    Micro stuttering is a visual artifact in real-time computer graphics in which the time intervals between consecutively displayed frames are uneven, even though the average frame rate reported by benchmarking software appears adequate. Tools such as 3DMark typically compute frame rates over intervals of one second or more, which can conceal momentary drops in the instantaneous frame rate that the viewer perceives as hitching or jerking of on-screen motion. At low frame rates the effect is visible as a stutter in moving images, degrading the experience in interactive applications such as video games. In severe cases a lower but more consistent frame rate can appear smoother than a higher but more erratic one. The term gained prominence in the late 2000s in discussions of multi-GPU rendering (see History), but micro stuttering also affects single-GPU systems. Common causes on modern hardware include real-time shader compilation, asset streaming from storage, VRAM exhaustion, and driver bugs. == Causes == === Shader compilation === A common cause of micro stuttering on modern PCs is real-time shader compilation. Shaders are small programs that instruct the GPU on how to render visual effects such as lighting, shadows, and reflections. On consoles, developers can pre-compile all shaders for the known, fixed hardware. On PCs, the variety of GPU architectures means shaders must often be compiled at run time, either when the game launches or during gameplay itself. When the rendering engine encounters a shader that has not yet been compiled, the CPU must finish the compilation before the GPU can draw the affected object. This causes a spike in frame time that the player perceives as a hitch. The problem has been particularly associated with games built on Unreal Engine 4 running under DirectX 12, because DX12 shifts more shader management responsibility to the application. Several techniques exist to reduce shader compilation stutter. 
Pipeline State Object (PSO) pre-caching records the shader permutations used at runtime so that they can be compiled in advance on subsequent launches. Asynchronous shader compilation moves the work to background CPU threads to avoid blocking the main rendering thread. Platform-level services such as Steam's shader pre-caching distribute previously compiled shaders to users with matching GPU hardware. The Steam Deck, which contains a single fixed GPU, benefits from pre-compiled shader caches because all units share the same hardware configuration. === Other causes === Micro stuttering on single-GPU systems can have several additional causes. CPU bottlenecks or scheduling interruptions from background tasks can prevent the processor from preparing frames at regular intervals. Asset streaming during gameplay (loading textures, geometry, or audio from storage) can produce hitches sometimes called traversal stutter; the use of solid-state drives and technologies such as DirectStorage has reduced but not eliminated this. VRAM exhaustion forces data to be swapped between video memory and system memory over the PCI Express bus, which is slower. Graphics driver bugs can also introduce stutter; Nvidia released hotfix driver 551.46 in February 2024 to correct intermittent micro stuttering when V-Sync was enabled. == Measurement == Micro stuttering drew attention to the limitations of average frame rate as a performance metric. In 2013, Scott Wasson at The Tech Report published a series of articles advocating frame time analysis, in which the delivery time of every individual frame is recorded and plotted rather than collapsed into a single frames-per-second figure. This approach was adopted by other hardware review publications in the following years. GPU reviews now routinely report 1% low and 0.1% low frame rates alongside the average. The 1% low is the average frame rate of the slowest 1% of frames in a sample; it serves as an indicator of worst-case smoothness. 
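    The difference between the average frame rate and the 1% low can be made concrete with a short calculation. This is a minimal sketch that follows the definition given above (the average frame rate of the slowest 1% of frames); capture tools differ in the exact convention they use, and the function names and sample frame times here are hypothetical.

```python
def average_fps(frame_times_ms):
    """Average frame rate over the whole sample: frames divided by elapsed seconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms, fraction=0.01):
    """Average frame rate of the slowest `fraction` of frames in the sample."""
    slowest_first = sorted(frame_times_ms, reverse=True)
    count = max(1, round(len(slowest_first) * fraction))
    worst = slowest_first[:count]
    return 1000.0 * len(worst) / sum(worst)


# Hypothetical capture: 198 frames delivered in 10 ms and 2 hitches of 60 ms.
frame_times = [10.0] * 198 + [60.0] * 2

avg = average_fps(frame_times)           # about 95 FPS
low = one_percent_low_fps(frame_times)   # about 17 FPS
```

    In this sample the two 60 ms hitches barely move the average but pull the 1% low far below it.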
A large gap between the average and the 1% low suggests poor frame pacing. Tools for capturing per-frame timing data include FRAPS, PresentMon, OCAT, CapFrameX, and MSI Afterburner with RivaTuner Statistics Server. == Mitigation == === Frame pacing === Frame pacing is a software technique that regulates the timing of frame delivery to produce even intervals between displayed frames. Game engines, GPU drivers, and platform libraries all implement frame pacing strategies to varying degrees. On mobile platforms, Google provides the Android Frame Pacing library (Swappy) as part of the Android Game Development Kit. In December 2025, the Khronos Group published the VK_EXT_present_timing Vulkan extension, giving developers explicit control over presentation timing in a cross-platform graphics API for the first time. === Variable refresh rate === Variable refresh rate (VRR) display technologies allow a monitor's refresh rate to change to match the GPU's frame output. Implementations include Nvidia G-Sync (2013), AMD FreeSync (2015), and the VESA Adaptive-Sync standard built into DisplayPort 1.2a and later. VRR eliminates the screen tearing that results from a mismatch between frame rate and refresh rate, and avoids the frame-holding behaviour of V-Sync that can itself cause stutter. It is effective at smoothing moderate frame rate fluctuations but cannot compensate for large sudden spikes in frame time such as those caused by shader compilation or heavy asset streaming. VRR support has become standard in gaming monitors, televisions (via HDMI 2.1), and the Xbox Series X/S and PlayStation 5 consoles. === Frame generation === Beginning with DLSS 3 on the GeForce RTX 40 series in 2022, Nvidia introduced AI-based frame generation, which uses dedicated optical flow hardware and a neural network to create new frames between traditionally rendered ones. AMD followed with FSR 3 in 2023, using an algorithmic approach, and the AI-based FSR 4 for the Radeon RX 9000 series in 2025. 
DLSS 4, released in January 2025 for the GeForce RTX 50 series, can generate up to three frames per rendered frame using a technique called Multi Frame Generation. Frame generation increases the displayed frame rate but introduces its own frame pacing concerns. If the underlying rendered frames are unevenly timed, the interpolated frames can make the unevenness more apparent rather than less. DLSS 4 addresses this with hardware-level flip metering on the GPU's display engine, which controls the timing of frame presentation more precisely than the CPU-based pacing used in DLSS 3. Both vendors pair frame generation with latency-reduction features (Nvidia Reflex and AMD Anti-Lag+) to offset the additional input latency that results from inserting synthetic frames into the pipeline. === Frame rate limiters === Capping the frame rate below the display's maximum refresh rate, using tools such as RivaTuner Statistics Server, in-game limiters, or driver-level settings, is a common way to improve frame pacing. Preventing the GPU from running ahead of the display reduces variability in frame delivery times and can produce a smoother result than an uncapped but more irregular frame rate. == History == === Multi-GPU configurations === Micro stuttering was first widely documented in the late 2000s as a side effect of multi-GPU configurations using Alternate Frame Rendering (AFR), in which consecutive frames are assigned to alternating GPUs. Because each GPU may take a different amount of time to complete its assigned frame — due to varying scene complexity, driver scheduling, or inter-GPU communication overhead — the resulting frame delivery is irregular even when the average frame rate is high. Both Nvidia SLI and AMD CrossFireX were affected, with dual-GPU setups exhibiting the worst frame pacing irregularities. 
In 2012 benchmarks using Battlefield 3, dual Radeon HD 7970 cards in CrossFire showed 85% variation in frame delivery times compared with 7% for a single card, while dual GeForce GTX 680 cards in SLI showed only 7% variation compared with 5% for a single card. Multi-GPU micro stuttering became a significant factor in the eventual decline and discontinuation of consumer multi-GPU gaming. Nvidia restricted SLI to a handful of enthusiast-class cards from the GeForce 10 series onward, then replaced it with NVLink on the GeForce RTX 20 series, which saw limited gaming adoption. AMD ceased active CrossFire development around 2017. By the mid-2020s, neither vendor's current consumer GPUs support multi-GPU rendering for games. Other factors that contributed to the decline include DirectX 12 placing multi-GPU support in the hands of game developers rather than driver authors, the incompatibility of temporal anti-aliasing and other temporal rendering techniques with AFR, and the increasing size, power draw, and cost of individual GPUs. The third-party utility RadeonPro could reduce CrossFire micro stuttering through dynamic V-Sync and frame pacing adjustments, and AMD later introduced a driver-level frame paci

  • Xu Li (computer scientist)

    Xu Li is a Chinese computer scientist and co-founder and current CEO of SenseTime, an artificial intelligence (AI) company. Xu has led SenseTime since the company's incorporation and helped it independently develop its proprietary deep learning platform. == Education and research == Xu obtained both his bachelor's and master's degrees in computer science from Shanghai Jiao Tong University. He received his doctorate in computer science from the Chinese University of Hong Kong. Xu has published more than 50 papers at international conferences and in journals in the field of computer vision and won the Best Paper Award at the international conference on Non-Photorealistic Rendering and Animation (NPAR) 2012 and the Best Reviewer Award at the international conferences Asian Conference on Computer Vision ACCV 2012 and International Conference on Computer Vision (ICCV) 2015. He has three algorithms that have been included into the visual open-source platform OpenCV, and his "L0 Smoothing" algorithm garnered the most citations in research papers over a span of five years (2011–2015) within the ACM Transactions on Graphics (TOG), a scientific journal that Thomson Reuters InCites has placed first among software engineering journals. == Career == Previously, Xu worked at Lenovo Corporate Research & Development. He was also a visiting researcher at Motorola China R&D Institute, Omron Research Institute, and Microsoft Research. == Selected publications == Jimmy Ren, Xiaohao Chen, Jianbo Liu, Wenxiu Sun, Li Xu, Jiahao Pang, Qiong Yan, Yu-wing Tai, "Accurate Single Stage Detector Using Recurrent Rolling Convolution", (CVPR), 2017. Jimmy SJ. Ren, Yongtao Hu, Yu-Wing Tai, Chuan Wang, Li Xu, Wenxiu Sun, Qiong Yan, "Look, Listen and Learn – A Multimodal LSTM for Speaker Identification", The 30th AAAI Conference on Artificial Intelligence (AAAI), 2016 Jimmy SJ. 
Ren, Li Xu, Qiong Yan, Wenxiu Sun, "Shepard Convolutional Neural Networks", Advances in Neural Information Processing Systems (NIPS), 2015. Xiaoyong Shen, Chao Zhou, Li Xu, Jiaya Jia, "Mutual-Structure for Joint Filtering", International Conference on Computer Vision (ICCV), (oral presentation), 2015. Jianping Shi, Qiong Yan, Li Xu, Jiaya Jia, "Hierarchical Image Saliency Detection on Extended CSSD", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015. Jianping Shi, Xin Tao, Li Xu, Jiaya Jia, "Break Ames Room Illusion: Depth from General Single Images", ACM Transactions on Graphics (TOG), (Proc. ACM SIGGRAPH ASIA 2015). Yongtao Hu, Jimmy SJ. Ren, Jingwen Dai, Chang Yuan, Li Xu, Wenping Wang, "Deep Multimodal Speaker Naming", ACM International Conference on Multimedia (MM), 2015. Li Xu, Jimmy SJ. Ren, Qiong Yan, Renjie Liao, Jiaya Jia, "Deep Edge-Aware Filters", International Conference on Machine Learning (ICML), 2015. Jianping Shi, Li Xu, Jiaya Jia, "Just Noticeable Defocus Blur Detection and Estimation", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. Ziyang Ma, Renjie Liao, Xin Tao, Li Xu, Jiaya Jia, Enhua Wu, "Handling Motion Blur in Multi-Frame Super-Resolution", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. Xiaoyong Shen, Qiong Yan, Li Xu, Lizhuang Ma, Jiaya Jia, "Multispectral Joint Image Restoration via Optimizing a Scale Map", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015. Jimmy SJ. Ren, Li Xu, "On Vectorization of Deep Convolutional Neural Networks for Vision Tasks", AAAI Conference on Artificial Intelligence (AAAI), 2015. == Awards and honors == Xu was ranked 7th in Fortune magazine's 2018 edition of its 40 Under 40. He was also named "China's Outstanding AI Industry Leader" by The Economic Observer, received the "Innovative Business Leader" Award under NetEase's "Future Technology Talent Awards", and was honored as one of Sina's "2017 Top Ten Economic Figures".
In 2018, Xu was named EY's "Entrepreneur of the Year China" in the Technology category.

  • Frederick Jelinek

    Frederick Jelinek

    Frederick Jelinek (18 November 1932 – 14 September 2010) was a Czech-American researcher in information theory, automatic speech recognition, and natural language processing. He is well known for his oft-quoted statement, "Every time I fire a linguist, the performance of the speech recognizer goes up". Jelinek was born in Czechoslovakia before World War II and emigrated with his family to the United States in the early years of the communist regime. He studied engineering at the Massachusetts Institute of Technology and taught for 10 years at Cornell University before accepting a job at IBM Research. In 1961, he married Czech screenwriter Milena Jelinek. At IBM, his team advanced approaches to computer speech recognition and machine translation. After IBM, he headed the Center for Language and Speech Processing at Johns Hopkins University for 17 years and was still working there on the day he died. == Personal life == Jelinek was born on November 18, 1932, as Bedřich Jelínek in Kladno to Vilém and Trude Jelínek. His father was Jewish; his mother was born in Switzerland to Czech Catholic parents and had converted to Judaism. Jelínek senior, a dentist, had planned early to escape Nazi occupation and flee to England; he arranged for a passport, visa, and the shipping of his dentistry materials. The couple planned to send their son to an English private school. However, Vilém decided at the last minute to stay and was eventually sent to the Theresienstadt concentration camp, where he died in 1945. The family was forced to move to Prague in 1941, but Frederick, his sister, and his mother escaped the concentration camps thanks to his mother's background. After the war, Jelinek entered the gymnasium, despite having missed several years of schooling because education of Jewish children had been forbidden since 1942.
His mother, anxious that her son should get a good education, worked hard to arrange their emigration, especially when it became clear he would not be allowed to even attempt the graduation examination. His mother hoped her son would become a physician, but Jelinek dreamed of being a lawyer. He studied engineering in evening classes at the City College of New York and received stipends from the National Committee for a Free Europe that allowed him to study at the Massachusetts Institute of Technology. About his choice of specialty, he said: "Fortunately, to electrical engineering there belonged a discipline whose aim was not the construction of physical systems: the theory of information". He obtained his Ph.D. in 1962, with Robert Fano as his adviser. In 1957, Jelinek paid an unexpected visit to Prague. He had been in Vienna and applied for a visa, hoping to see his former acquaintances again. He met with his old friend Miloš Forman, who introduced him to film student Milena Tobolová, whose screenplay had been the basis for the movie Easy Life (Snadný život). His flight back to the U.S. had a stopover in Munich, during which he called her to propose. Tobolová was considered a dissident and the authorities were not happy with her film. Jelinek asked for help from Jerome Wiesner and Cyrus Eaton, the latter of whom lobbied Nikita Khrushchev. Following the inauguration of John F. Kennedy, a group of Czech dissidents was allowed to emigrate in January 1961. Thanks to the lobbying, the future Milena Jelinek was one of them. After completing his graduate studies, Jelinek, who had developed an interest in linguistics, had plans to work with Charles F. Hockett at Cornell University. However, these plans fell through, and during the next ten years he continued to study information theory. Having previously worked at IBM during a sabbatical, he began full-time work there in 1972, at first on leave from Cornell but permanently from 1974. He remained there for over twenty years.
Although at first he had been offered a regular research job, upon his arrival he learned that Josef Raviv had recently been promoted to head of the newly opened IBM Haifa Research Laboratory, and Jelinek became head of the Continuous Speech Recognition group at the Thomas J. Watson Research Center. Despite his team's successes in this area, Jelinek's work remained little known in his home country because Czech scientists were not allowed to participate in key conferences. After the 1989 fall of communism, Jelinek helped establish scientific relationships, regularly visiting to lecture and helping to persuade IBM to establish a computing centre at Charles University. In 1993, he retired from IBM and went to Johns Hopkins University's Center for Language and Speech Processing, where he was director and Julian Sinclair Smith Professor of Electrical and Computer Engineering. He was still working there at the time of his death; Jelinek died of a heart attack at the close of an otherwise normal workday in mid-September 2010. He was survived by his wife, daughter and son, sister, stepsister, and three grandchildren, including Sophie Gold Jelinek. == Research and legacy == Information theory was a fashionable scientific approach in the mid-1950s. However, pioneer Claude Shannon wrote in 1956 that this trendiness was dangerous. He said, "Our fellow scientists in many different fields, attracted by the fanfare and by the new avenues opened to scientific analysis, are using these ideas in their own problems ... It will be all too easy for our somewhat artificial prosperity to collapse overnight when it is realized that the use of a few exciting words like information, entropy, redundancy, do not solve all our problems." During the next decade, a combination of factors shut down the application of information theory to natural language processing (NLP) problems, in particular machine translation.
One factor was the 1957 publication of Noam Chomsky's Syntactic Structures, which stated, "probabilistic models give no insight into the basic problems of syntactic structure". This accorded well with the philosophy of the artificial intelligence research of the time, which promoted rule-based approaches. The other factor was the 1966 ALPAC report, which recommended that the government should stop funding research into machine translation. ALPAC chairman John Pierce later said that the field was filled with "mad inventors or untrustworthy engineers". He said that the underlying linguistic problems must be solved before attempts at NLP could be reasonably made. These elements essentially halted research in the field. Jelinek had begun to develop an interest in linguistics after the immigration of his wife, who initially enrolled in the MIT linguistics program with the help of Roman Jakobson. Jelinek often accompanied her to Chomsky's lectures, and even discussed the possibility of changing orientation with his adviser. Fano was "really upset", and after the failure of his project with Hockett at Cornell, Jelinek did not return to this field of research until starting work at IBM. The scope of research at IBM was considerably different from that of most other teams. According to Mark Liberman, "While [Jelinek] was leading IBM's effort to solve the general dictation problem during the decade or so following 1972, most other U.S. companies and academic researchers were working on very limited problems ... or were staying out of the field entirely". Jelinek regarded speech recognition as an information theory problem (a noisy channel, in this case the acoustic signal), which some observers considered a daring approach. The concept of perplexity was introduced in the team's first model, New Raleigh Grammar, which was published in 1976 as the paper "Continuous Speech Recognition by Statistical Methods" in the journal Proceedings of the IEEE.
According to Young, the basic noisy channel approach "reduced the speech recognition problem to one of producing two statistical models". Whereas New Raleigh Grammar was a hidden Markov model, their next model, called Tangora, was broader and involved n-grams, specifically trigrams. Even though "it was obvious to everyone that this model was hopelessly impoverished", it was not improved upon until Jelinek presented another paper in 1999. The same trigram approach was applied to phones in single words. Although the identification of parts of speech turned out not to be very useful for speech recognition, tagging methods developed during these projects are now used in various NLP applications. The incremental research techniques developed at IBM eventually became dominant in the field after DARPA, in the mid-1980s, returned to NLP research and imposed that methodology on participating teams, together with shared common goals, data, and precise evaluation metrics. The Continuous Speech Recognition Group's research, which required large amounts of data to train the algorithms, eventually led to the creation of the Linguistic Data Consortium. In the 1980s, although the broader problem of speech recognition remained unsolved, they sought to apply the methods developed to other problems; machine translat
