Django (web framework)

Django (JANG-goh; sometimes stylized as django) is a free and open-source, Python-based web framework that runs on a web server. It follows the model–template–views (MTV) architectural pattern. It is maintained by the Django Software Foundation (DSF), an independent organization established in the US as a 501(c)(3) non-profit.

Django's primary goal is to ease the creation of complex, database-driven websites. The framework emphasizes reusability and "pluggability" of components, less code, low coupling, rapid development, and the principle of don't repeat yourself. Python is used throughout, even for settings, files, and data models. Django also provides an optional administrative create, read, update and delete interface that is generated dynamically through introspection and configured via admin models.

Some well-known sites that use Django include Instagram, Mozilla, Disqus, Bitbucket, Nextdoor, and Clubhouse.

== History ==

Django was created in the autumn of 2003, when the web programmers at the Lawrence Journal-World newspaper, Adrian Holovaty and Simon Willison, began using Python to build applications. Jacob Kaplan-Moss was hired early in Django's development, shortly before Willison's internship ended. It was released publicly under a BSD license in July 2005. The framework was named after guitarist Django Reinhardt. Holovaty is a Romani jazz guitar player inspired in part by Reinhardt's music.

In June 2008, it was announced that a newly formed Django Software Foundation (DSF) would maintain Django in the future.

== Features ==

=== Components ===

Despite having its own nomenclature, such as naming the callable objects generating the HTTP responses "views", the core Django framework can be seen as an MVC architecture.
It consists of an object-relational mapper (ORM) that mediates between data models (defined as Python classes) and a relational database ("Model"), a system for processing HTTP requests with a web templating system ("View"), and a regular-expression-based URL dispatcher ("Controller").

Also included in the core framework are:

* a lightweight and standalone web server for development and testing
* a form serialization and validation system that can translate between HTML forms and values suitable for storage in the database
* a template system that utilizes the concept of inheritance borrowed from object-oriented programming
* a caching framework that can use any of several cache methods
* support for middleware classes that can intervene at various stages of request processing and carry out custom functions
* an internal dispatcher system that allows components of an application to communicate events to each other via pre-defined signals
* an internationalization system, including translations of Django's own components into a variety of languages
* a serialization system that can produce and read XML and/or JSON representations of Django model instances
* a system for extending the capabilities of the template engine
* an interface to Python's built-in unit test framework

=== Bundled applications ===

The main Django distribution also bundles a number of applications in its "contrib" package, including:

* an extensible authentication system
* the dynamic administrative interface
* tools for generating RSS and Atom syndication feeds
* a "Sites" framework that allows one Django installation to run multiple websites, each with their own content and applications
* tools for generating Sitemaps
* built-in mitigation for cross-site request forgery, cross-site scripting, SQL injection, password cracking and other typical web attacks, most of them turned on by default
* a framework for creating geographic information system (GIS) applications

=== Extensibility ===

Django's configuration system allows third-party code to be plugged into a regular project, provided that it follows the reusable app conventions. More than 5000 packages are available to extend the framework's original behavior, providing solutions to issues the original tool didn't tackle: registration, search, API provision and consumption, CMS, etc.

This extensibility is, however, mitigated by internal components' dependencies. While the Django philosophy implies loose coupling, the template filters and tags assume one engine implementation, and both the auth and admin bundled applications require the use of the internal ORM. None of these filters or bundled apps are mandatory to run a Django project, but reusable apps tend to depend on them, encouraging developers to keep using the official stack in order to benefit fully from the apps ecosystem.

=== Server arrangements ===

Django can be run on ASGI or WSGI-compliant web servers. Django officially supports five database backends: PostgreSQL, MySQL, MariaDB, SQLite, and Oracle. Microsoft SQL Server can be used with mssql-django.

== Version history ==

The Django team will occasionally designate certain releases to be "long-term support" (LTS) releases. LTS releases will get security and data loss fixes applied for a guaranteed period of time, typically 3+ years, regardless of the pace of releases afterwards.

== Community ==

=== DjangoCon ===

There is a semiannual conference for Django developers and users, named "DjangoCon", that has been held since September 2008. One DjangoCon is held annually in Europe, in May or June, and another is held in the United States in August or September, in various cities.

==== United States ====

The 2012 DjangoCon took place in Washington, D.C., from September 3 to 8. The 2013 DjangoCon was held in Chicago at the Hyatt Regency Hotel, and the post-conference sprints were hosted at Digital Bootcamp, a computer training center. The 2014 DjangoCon US returned to Portland, OR from August 30 to September 6.
The 2015 DjangoCon US was held in Austin, TX from September 6 to 11 at the AT&T Executive Center. The 2016 DjangoCon US was held in Philadelphia, PA at The Wharton School of the University of Pennsylvania from July 17 to 22. The 2017 DjangoCon US was held in Spokane, WA; in 2018 DjangoCon US was held in San Diego, CA. DjangoCon US 2019 was held again in San Diego, CA from September 22 to 27. DjangoCon 2021 took place virtually and in 2022, DjangoCon US returned to San Diego from October 16 to 21. DjangoCon US 2023 was held from October 16 to 20 at the Durham, NC convention center and DjangoCon US 2024 took place also in Durham in September 22 to 27. DjangoCon US 2025 was held from September 8 to 12 in Chicago, Illinois. ==== Europe ==== The 2025 edition of DjangoCon Europe took place in Dublin, Ireland from 23 to 27 April. In 2024, the conference was hosted in Vigo, Spain. Edinburgh, Scotland served as the venue for DjangoCon Europe in 2023. The 2022 conference was organized in Porto, Portugal. In 2021, DjangoCon Europe was held virtually due to the COVID-19 pandemic. The 2020 edition was also conducted as a fully virtual event. DjangoCon Europe 2019 was held in Copenhagen, Denmark. In 2018, the event took place in Heidelberg, Germany. The 2017 conference was convened in Florence, Italy. DjangoCon Europe 2012 was organized in Zurich, Switzerland. ==== Australia ==== Django mini-conferences are usually held every year as part of the Australian Python Conference 'PyCon AU'. Previously, these mini-conferences have been held in: Hobart, Australia, in July 2013, Brisbane, Australia, in August 2014 and 2015, Melbourne, Australia in August 2016 and 2017, and Sydney, Australia, in August 2018 and 2019. ==== Africa ==== The first DjangoCon Africa was held in Zanzibar, Tanzania, from 6 to 11 November 2023. The event hosted approximately 200 attendees from 22 countries, including 103 women. 
The conference featured 26 talks on topics such as software development, education, careers, accessibility, and agriculture, often highlighting perspectives from across the African continent. Future editions of the conference are planned, with details available on the official website.

=== Community groups & programs ===

Django has spawned user groups and meetups around the world. A notable group is the Django Girls organization, which began in Poland but has since held events in 91 countries.

Another initiative is Djangonaut Space, a mentorship program aimed at supporting new contributors to the Django ecosystem. The program pairs experienced mentors with developers to guide them through making meaningful contributions to Django and its community. It emphasizes long-term engagement, inclusion, and collaborative open-source development.

== Ports to other languages ==

Programmers have ported Django's template engine design from Python to other languages, providing decent cross-platform support. Some of these options are more direct ports; others, though inspired by Django and retaining its concepts, take the liberty to deviate from Django's design:

* Liquid for Ruby
* Template::Swig for Perl
* Twig for PHP and JavaScript
* Jinja for Python
* ErlyDTL for Erlang

== CMSs based on Django Framework ==

Django as a framework is capable of building a complete CMS

Patent visualisation

Patent visualisation is an application of information visualisation. The number of patents has been increasing, encouraging companies to consider intellectual property as a part of their strategy. Patent visualisation, like patent mapping, is used to quickly view a patent portfolio. Software dedicated to patent visualisation began to appear in 2000, for example Aureka from Aurigin (now owned by Thomson Reuters). Many patent and portfolio analytics platforms, such as Questel, Patent Forecast, PatSnap, Patentcloud, Relecura, and Patent iNSIGHT Pro, offer options to visualise specific data within patent documents by creating topic maps, priority maps, IP Landscape reports, etc. Software converts patents into infographics or maps, to allow the analyst to "get insight into the data" and draw conclusions. Also called patinformatics, it is the "science of analysing patent information to discover relationships and trends that would be difficult to see when working with patent documents on a one-and-one basis". Patents contain structured data (like publication numbers) and unstructured text (like title, abstract, claims and visual info). Structured data are processed by data-mining and unstructured data are processed with text-mining. == Data mining == The main step in processing structured information is data-mining, which emerged in the late 1980s. Data mining involves statistics, artificial intelligence, and machine learning. Patent data mining extracts information from the structured data of the patent document. These structured data are bibliographic fields such as location, date or status. === Structured fields === === Advantages === Data mining allows study of filing patterns of competitors and locates main patent filers within a specific area of technology. This approach can be helpful to monitor competitors' environments, moves and innovation trends and gives a macro view of a technology status. 
== Text-mining ==

=== Principle ===

Text mining is used to search through unstructured text documents. This technique is widely used on the Internet and has had success in bioinformatics and, more recently, in the intellectual property environment.

Text mining is based on a statistical analysis of word recurrence in a corpus. An algorithm extracts words and expressions from the title, summary and claims and gathers them by declension. Words such as "and" and "if" are labeled as non-information-bearing words and are stored in the stopword list. Stoplists can be specialised in order to create an accurate analysis. Next, the algorithm ranks the words by weight, according to their frequency in the patent corpus and the number of documents containing the word. The score for each word is calculated using a formula such as:

$$\text{Weight} = \frac{\text{Term Frequency}}{\text{Document Frequency}} = \frac{\text{Frequency of the word or expression in the Text Sea}}{\text{Number of documents containing the expression or word}}$$

A word used frequently across many documents has less weight than a word used frequently in only a few patents. Words under a minimum weight are eliminated, leaving a list of pertinent words or descriptors. Each patent is associated with the descriptors found in the selected document. Further, in the process of clusterisation, these descriptors are used either as subsets into which the patents are grouped, or as tags to place the patents in predetermined categories, for example keywords from International Patent Classifications.
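The weighting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular patent analytics tool: the stopword list, the `tokenize` helper, the sample corpus and the threshold are all hypothetical, and "term frequency" is assumed here to mean the count of a word across the whole corpus.

```python
from collections import Counter

# Hypothetical stopword list; real stoplists are larger and often domain-specific.
STOPWORDS = {"a", "an", "and", "for", "if", "in", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, strip surrounding punctuation, and drop stopwords."""
    words = (w.strip(".,;:()\"'").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def term_weights(documents):
    """Weight = occurrences of a word in the corpus / documents containing it."""
    term_freq = Counter()  # total occurrences across the corpus (assumed meaning)
    doc_freq = Counter()   # number of documents in which the word appears
    for doc in documents:
        tokens = tokenize(doc)
        term_freq.update(tokens)
        doc_freq.update(set(tokens))
    return {word: term_freq[word] / doc_freq[word] for word in term_freq}


def descriptors(documents, min_weight):
    """Keep only the words whose weight reaches the minimum weight."""
    weights = term_weights(documents)
    return sorted(word for word, weight in weights.items() if weight >= min_weight)


# Hypothetical three-document corpus of patent titles.
corpus = [
    "A battery electrode with a lithium coating and a lithium binder",
    "Method for charging a battery",
    "Battery housing for a vehicle",
]
weights = term_weights(corpus)
selected = descriptors(corpus, min_weight=1.5)
```

In this toy corpus "battery" occurs three times but in all three documents, so its weight is 1.0, while "lithium" occurs twice in a single document and gets a weight of 2.0. With a minimum weight of 1.5 only "lithium" survives as a descriptor, which matches the behaviour described in the text.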
Four text parts can be processed with text-mining:

* Title
* Abstract
* Claim
* Patent Full-Text

Software offers different combinations, but title, abstract and claim are generally the most used, providing a good balance between interference and relevancy.

=== Advantages ===

Text-mining can be used to narrow a search or quickly evaluate a patent corpus. For instance, if a query produces irrelevant documents, a multi-level clustering hierarchy identifies them so that they can be deleted and the search refined. Text-mining can also be used to create internal taxonomies specific to a corpus for possible mapping.

== Visualisations ==

Combining patent analysis with informatics tools offers an overview of the environment through value-added visualisations. As patents contain structured and unstructured information, visualisations fall into two categories. Structured data can be rendered with data mining in macrothematic maps and statistical analysis. Unstructured information can be shown in forms such as clouds, cluster maps and 2D keyword maps.

=== Data mining visualisation ===

=== Text mining visualisation ===

=== Visualisation for both data-mining and text-mining ===

Mapping visualisations can be used for both text-mining and data-mining results.

== Uses ==

What patent visualisation can highlight:

* Competitors
* Partners
* New innovations
* Technologic environment description
* Networks

Field application:

* R&D strategy management
* Competitive intelligence
* Licensing Strategy

LTE Advanced

LTE Advanced, also named or recognized as LTE+, LTE-A or 4G+, is a 4G mobile cellular communication standard developed by 3GPP as a major enhancement of the Long Term Evolution (LTE) standard. Three technologies from the LTE-Advanced tool-kit – carrier aggregation, 4x4 MIMO and 256QAM modulation in the downlink – if used together and with sufficient aggregated bandwidth, can deliver maximum peak downlink speeds approaching, or even exceeding, 1 Gbit/s. This is significantly more than the peak 300 Mbit/s rate offered by the preceding LTE standard. Later developments have resulted in LTE Advanced Pro (or 4.9G) which increases bandwidth even further. The first ever LTE Advanced network was deployed in 2013 by SK Telecom in South Korea. In August 2019, the Global mobile Suppliers Association (GSA) reported that there were 304 commercially launched LTE-Advanced networks in 134 countries. Overall, 335 operators are investing in LTE-Advanced (in the form of tests, trials, deployments or commercial service provision) in 141 countries. == Name == LTE Advanced is also named (indicated as) LTE+, LTE-A, or (on Samsung Galaxy and Xiaomi smartphones) as 4G+. Such networks have also often been described as ‘Gigabit LTE networks’ mirroring a term that is also used in the fixed broadband industry. == History == The mobile communication industry and standards organizations have therefore started work on 4G access technologies, such as LTE Advanced. At a workshop in April 2008 in China, 3GPP agreed the plans for work on Long Term Evolution (LTE). A first set of specifications were approved in June 2008. Besides the peak data rate 1 Gb/s as defined by the ITU-R, it also targets faster switching between power states and improved performance at the cell edge. Detailed proposals are being studied within the working groups. The LTE+ format was first proposed by NTT DoCoMo of Japan and has been adopted as the international standard. 
It was formally submitted as a candidate 4G to ITU-T in late 2009 as meeting the requirements of the IMT-Advanced standard, and was standardized by the 3rd Generation Partnership Project (3GPP) in March 2011 as 3GPP Release 10. The work by 3GPP to define a 4G candidate radio interface technology started in Release 9 with the study phase for LTE-Advanced. Being described as a 3.9G (beyond 3G but pre-4G), the first release of LTE did not meet the requirements for 4G (also called IMT Advanced as defined by the International Telecommunication Union) such as peak data rates up to 1 Gb/s. The ITU has invited the submission of candidate Radio Interface Technologies (RITs) following their requirements in a circular letter, 3GPP Technical Report (TR) 36.913, "Requirements for Further Advancements for E-UTRA (LTE-Advanced)." These are based on ITU's requirements for 4G and on operators’ own requirements for advanced LTE. Major technical considerations include the following: Continual improvement to the LTE radio technology and architecture Scenarios and performance requirements for working with legacy radio technologies Backward compatibility of LTE-Advanced with LTE. An LTE terminal should be able to work in an LTE-Advanced network and vice versa. Any exceptions will be considered by 3GPP. Consideration of recent World Radiocommunication Conference (WRC-07) decisions regarding frequency bands to ensure that LTE-Advanced accommodates the geographically available spectrum for channels above 20 MHz. Also, specifications must recognize those parts of the world in which wideband channels are not available. Likewise, 'WiMAX 2', 802.16m, has been approved by ITU as the IMT Advanced family. WiMAX 2 is designed to be backward compatible with WiMAX 1 devices. Most vendors now support conversion of 'pre-4G', pre-advanced versions and some support software upgrades of base station equipment from 3G. 
== Proposals == The target of 3GPP LTE Advanced is to reach and surpass the ITU requirements. LTE Advanced should be compatible with first release LTE equipment, and should share frequency bands with first release LTE. In the feasibility study for LTE Advanced, 3GPP determined that LTE Advanced would meet the ITU-R requirements for 4G. The results of the study are published in 3GPP Technical Report (TR) 36.912. One of the important LTE Advanced benefits is the ability to take advantage of advanced topology networks; optimized heterogeneous networks with a mix of macrocells with low power nodes such as picocells, femtocells and new relay nodes. The next significant performance leap in wireless networks will come from making the most of topology, and brings the network closer to the user by adding many of these low power nodes – LTE Advanced further improves the capacity and coverage, and ensures user fairness. LTE Advanced also introduces multicarrier to be able to use ultra wide bandwidth, up to 100 MHz of spectrum supporting very high data rates. In the research phase many proposals have been studied as candidates for LTE Advanced (LTE-A) technologies. 
The proposals could roughly be categorized into: Support for relay node base stations Coordinated multipoint (CoMP) transmission and reception UE Dual TX antenna solutions for SU-MIMO and diversity MIMO, commonly referred to as 2x2 MIMO Scalable system bandwidth exceeding 20 MHz, up to 100 MHz Carrier aggregation of contiguous and non-contiguous spectrum allocations Local area optimization of air interface Nomadic / Local Area network and mobility solutions Flexible spectrum usage Cognitive radio Automatic and autonomous network configuration and operation Support of autonomous network and device test, measurement tied to network management and optimization Enhanced precoding and forward error correction Interference management and suppression Asymmetric bandwidth assignment for FDD Hybrid OFDMA and SC-FDMA in uplink UL/DL inter eNB coordinated MIMO SONs, Self Organizing Networks methodologies Within the range of system development, LTE-Advanced and WiMAX 2 can use up to 8x8 MIMO and 128-QAM in downlink direction. Example performance: 100 MHz aggregated bandwidth, LTE-Advanced provides almost 3.3 Gbit peak download rates per sector of the base station under ideal conditions. Advanced network architectures combined with distributed and collaborative smart antenna technologies provide several years road map of commercial enhancements. The 3GPP standards Release 12 added support for 256-QAM. A summary of a study carried out in 3GPP can be found in TR36.912. == Timeframe and introduction of additional features == Original standardization work for LTE-Advanced was done as part of 3GPP Release 10, which was frozen in April 2011. Trials were based on pre-release equipment. Major vendors support software upgrades to later versions and ongoing improvements. In order to improve the quality of service for users in hotspots and on cell edges, heterogeneous networks (HetNets) are formed of a mixture of macro-, pico- and femto base stations serving corresponding-size areas. 
Frozen in December 2012, 3GPP Release 11 concentrates on better support of HetNet. Coordinated Multi-Point operation (CoMP) is a key feature of Release 11 in order to support such network structures. Whereas users located at a cell edge in homogenous networks suffer from decreasing signal strength compounded by neighbor cell interference, CoMP is designed to enable use of a neighboring cell to also transmit the same signal as the serving cell, enhancing quality of service on the perimeter of a serving cell. In-device Co-existence (IDC) is another topic addressed in Release 11. IDC features are designed to ameliorate disturbances within the user equipment caused between LTE/LTE-A and the various other radio subsystems such as WiFi, Bluetooth, and the GPS receiver. Further enhancements for MIMO such as 4x4 configuration for the uplink were standardized. The higher number of cells in HetNet results in user equipment changing the serving cell more frequently when in motion. The ongoing work on LTE-Advanced in Release 12, amongst other areas, concentrates on addressing issues that come about when users move through HetNet, such as frequent hand-overs between cells. It also included use of 256-QAM. == First technology demonstrations and field trials == This list covers technology demonstrations and field trials up to the year 2014, paving the way for a wider commercial deployment of the VoLTE technology worldwide. From 2014 onwards various further operators trialled and demonstrated the technology for future deployment on their respective networks. These are not covered here. Instead a coverage of commercial deployments can be found in the section below. == LTE Advanced Pro == LTE Advanced Pro (LTE-A Pro, also known as 4.5G, 4.5G Pro, 4.9G, Pre-5G, 5G Project) is a name for 3GPP release 13 and 14. It is an evolution of LTE Advanced (LTE-A) cellular standard supporting data rates in excess of 3 Gbit/s using 32-carrier aggregation. It also introduces th

Common-mode signal

In electrical engineering, a common-mode signal is the identical component of voltage present at both input terminals of an electrical device. In telecommunication, the common-mode signal on a transmission line is also known as longitudinal voltage. Common-mode interference (CMI) is a type of common-mode signal. Common-mode interference is interference that appears on both signal leads, or coherent interference that affects two or more elements of a network.

In most electrical circuits, desired signals are transferred by a differential voltage between two conductors. If the voltages on these conductors are U1 and U2, the common-mode signal is the average of the voltages:

$$U_{\text{cm}} = \frac{U_1 + U_2}{2}$$

When referenced to the local common or ground, a common-mode signal appears on both lines of a two-wire cable, in phase and with equal amplitudes. Technically, a common-mode voltage is one-half the vector sum of the voltages from each conductor of a balanced circuit to local ground or common. Such signals can arise from one or more of the following sources:

* Radiated signals coupled equally to both lines,
* An offset from signal common created in the driver circuit, or
* A ground differential between the transmitting and receiving locations.

Noise induced into a cable, or transmitted from a cable, usually occurs in the common mode, as the same signal tends to be picked up by both conductors in a two-wire cable. Likewise, RF noise transmitted from a cable tends to emanate from both conductors. Elimination of common-mode signals on cables entering or leaving electronic equipment is important to ensure electromagnetic compatibility. Unless the intention is to transmit or receive radio signals, an electronic designer generally designs electronic circuits to minimise or eliminate common-mode effects.

== Methods of eliminating common-mode signals ==

* Differential amplifiers or receivers that respond only to voltage differences, e.g. those between the wires that constitute a pair. This method is particularly suited for instrumentation where signals are transmitted through DC bias. For sensors with very high output impedance that require a very high common-mode rejection ratio, a differential amplifier is combined with input buffers to form an instrumentation amplifier.
* An inductor where a pair of signaling wires follow the same path through the inductor, e.g. in a bifilar winding configuration such as used in Ethernet magnetics. Useful for AC and DC signals, but will filter only higher frequency common-mode signals.
* A transformer, which is useful for AC signals only, and will filter any form of common-mode noise, but may be used in combination with a bifilar wound coil to eliminate capacitive coupling of higher frequency common-mode signals across the transformer. Used in twisted pair Ethernet.

Common-mode filtering may also be used to prevent egress of noise for electromagnetic compatibility purposes:

* High frequency common-mode signals (e.g., RF noise from a computing circuit) may be blocked using a ferrite bead clamped to the outside of a cable. These are often observable on laptop computer power supplies near the jack socket, and on good quality mouse or printer USB cables and HDMI cables.
* Switch mode power supplies include common and differential mode filtering inductors to block the switching signal noise returning into mains wiring.

Common-mode rejection ratio is a measure of how well a circuit eliminates common-mode interference.
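As a small worked illustration of the quantities discussed in this article, the sketch below splits a pair of conductor voltages into their common-mode and differential parts. It also expresses a common-mode rejection ratio in decibels using the usual definition, the ratio of differential gain to common-mode gain, which is a standard formula not stated in the text above. The voltages and gains are hypothetical example values.

```python
import math


def common_mode(u1, u2):
    """Common-mode signal: the average of the two conductor voltages."""
    return (u1 + u2) / 2


def differential(u1, u2):
    """Differential signal: the voltage difference between the conductors."""
    return u1 - u2


def cmrr_db(differential_gain, common_mode_gain):
    """CMRR in dB, assuming the usual definition 20*log10(|A_d / A_cm|)."""
    return 20 * math.log10(abs(differential_gain / common_mode_gain))


# Hypothetical example: a 1.0 V differential signal (+0.5 V / -0.5 V)
# riding on 2.0 V of interference that is picked up equally by both wires.
u1 = 2.0 + 0.5
u2 = 2.0 - 0.5

u_cm = common_mode(u1, u2)     # the shared 2.0 V interference
u_diff = differential(u1, u2)  # the wanted 1.0 V signal

# Hypothetical amplifier: differential gain 1000, common-mode gain 1.
rejection = cmrr_db(1000.0, 1.0)  # about 60 dB
```

A differential receiver ideally responds only to `u_diff`, so the interference disappears from its output. The CMRR figure indicates how closely a real circuit approaches that ideal.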

SPACEMAP

SPACEMAP (Korean: 스페이스맵) is a South Korean satellite orbit optimization and satellite communications company headquartered in Seoul, South Korea. The company was founded in 2021 by CEO, Douglas Deok-Soo Kim, as an offshoot of Hanyang University. It was funded by the Leader Research grant from the National Research Foundation of Korea with the goal of capitalizing on the growing space industry. == History == Kim initially began research into Voronoi diagrams at the University of Michigan. He met with Dr. Misoon Ma, former director of the Asia Division of the U.S. Air Force Office of Scientific Research (AFOSR) and was recruited to work with the U.S. Air force, using Voronoi diagrams for a satellite collision prevention program. After his work with the U.S. Air Force, Kim founded SPACEMAP Inc in September 2021. In 2023, the company was selected by Korea's Tech Incubator Program for Startups (TIPS) to be funded up to 17 billion KRW (approx. US$13 million) in 3 years. == Technology == The services provided by SPACEMAP are based on using dynamic Voronoi diagrams to predict satellite orbits with the aim of enhancing space mission safety and efficiency. For complex problems involving many moving points, Voronoi diagrams maintain a near-constant computation time regardless of the number of points involved. By utilizing Voronoi diagrams and artificial intelligence, the software can easily determine the number of neighboring satellites surrounding a specific satellite and calculate the distances between them, thereby predicting the probability of a collision. SPACEMAP claims their method to be superior in computational time and memory efficiency, compared to the previously established three-filter method. == Products == SPACEMAP offers satellite products and services including the following: AstroOne, a conjunction assessment, and optimal collision avoidance service for all space vehicles in both orbital and non-orbital motions. 
AstroOrca, providing data transmission for satellites in multiple orbits, launch optimization, shuttle logistics for space gas stations, and Active Debris Removal (ADR) itinerary. AstroLibrary, a library of RESTful APIs to access the C++ implementation of SPACEMAP's Voronoi diagram algorithms wrapped in a Python interface. It also provides real-time tracking of the North Korean reconnaissance satellite, Malligyong-1.

Automatic summarization

Automatic summarization is the process of shortening a set of data computationally, to create a subset (a summary) that represents the most important or relevant information within the original content. Artificial intelligence (AI) algorithms are commonly developed and employed to achieve this, specialized for different types of data. Text summarization is usually implemented by natural language processing methods, designed to locate the most informative sentences in a given document. On the other hand, visual content can be summarized using computer vision algorithms. Image summarization is the subject of ongoing research; existing approaches typically attempt to display the most representative images from a given image collection, or generate a video that only includes the most important content from the entire collection. Video summarization algorithms identify and extract from the original video content the most important frames (key-frames), and/or the most important video segments (key-shots), normally in a temporally ordered fashion. Video summaries simply retain a carefully selected subset of the original video frames and, therefore, are not identical to the output of video synopsis algorithms, where new video frames are being synthesized based on the original video content. == Commercial products == In 2022 Google Docs released an automatic summarization feature. == Approaches == There are two general approaches to automatic summarization: extraction and abstraction. === Extraction-based summarization === Here, content is extracted from the original data, but the extracted content is not modified in any way. Examples of extracted content include key-phrases that can be used to "tag" or index a text document, or key sentences (including headings) that collectively comprise an abstract, and representative images or video segments, as stated above. 
For text, extraction is analogous to the process of skimming, where the summary (if available), headings and subheadings, figures, the first and last paragraphs of a section, and optionally the first and last sentences in a paragraph are read before one chooses to read the entire document in detail. Other examples of extraction include key sequences of text selected for clinical relevance (including patient/problem, intervention, and outcome).

=== Abstractive-based summarization ===

Abstractive summarization methods generate new text that did not exist in the original text. This has been applied mainly for text. Abstractive methods build an internal semantic representation of the original content (often called a language model), and then use this representation to create a summary that is closer to what a human might express. Abstraction may transform the extracted content by paraphrasing sections of the source document, to condense a text more strongly than extraction. Such transformation, however, is computationally much more challenging than extraction, involving both natural language processing and often a deep understanding of the domain of the original text in cases where the original document relates to a special field of knowledge. "Paraphrasing" is even more difficult to apply to images and videos, which is why most summarization systems are extractive.

=== Aided summarization ===

Approaches aimed at higher summarization quality rely on combined software and human effort. In Machine Aided Human Summarization, extractive techniques highlight candidate passages for inclusion (to which the human adds or removes text). In Human Aided Machine Summarization, a human post-processes software output, in the same way that one edits the output of automatic translation by Google Translate.

== Applications and systems for summarization ==

There are broadly two types of extractive summarization tasks depending on what the summarization program focuses on.
The first is generic summarization, which focuses on obtaining a generic summary or abstract of the collection (whether documents, sets of images, videos, news stories, etc.). The second is query-relevant summarization, sometimes called query-based summarization, which summarizes objects specific to a query. Summarization systems are able to create both query-relevant text summaries and generic machine-generated summaries depending on what the user needs. An example of a summarization problem is document summarization, which attempts to automatically produce an abstract from a given document. Sometimes one is interested in generating a summary from a single source document, while at other times one can use multiple source documents (for example, a cluster of articles on the same topic). The latter problem is called multi-document summarization. A related application is summarizing news articles. Imagine a system that automatically pulls together news articles on a given topic (from the web) and concisely represents the latest news as a summary. Image collection summarization is another application example of automatic summarization. It consists of selecting a representative set of images from a larger set of images. A summary in this context is useful to show the most representative images of results in an image collection exploration system. Video summarization is a related domain, where the system automatically creates a trailer of a long video. This also has applications in consumer or personal videos, where one might want to skip the boring or repetitive actions. Similarly, in surveillance videos, one would want to extract important and suspicious activity, while ignoring all the boring and redundant frames captured. At a very high level, summarization algorithms try to find subsets of objects (such as a set of sentences or a set of images) that cover the information of the entire set. This is also called the core-set.
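The notion of a subset that covers the information of the whole set can be sketched as a greedy maximum-coverage selection. The sketch below assumes each object (a sentence, say) is represented by a set of words and repeatedly picks the object that adds the most not-yet-covered words; the function name `greedy_cover` and this word-set representation are illustrative assumptions. Coverage of this kind is a standard example of a monotone submodular objective, for which greedy selection is a common approximation strategy.

```python
def greedy_cover(items, budget):
    """Greedily pick up to `budget` items whose word sets cover the most distinct words.

    `items` maps an item id to the set of words (or other features) it contains.
    Returns the chosen ids in selection order and the set of covered words.
    """
    covered = set()
    chosen = []
    remaining = dict(items)
    while remaining and len(chosen) < budget:
        # Marginal gain: how many not-yet-covered words this item would add.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining item adds anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered
```

With a budget of two and three sentences where one is largely contained in another, the redundant sentence is skipped in favor of one that adds new words, which is the behavior a summary with good coverage and low redundancy is meant to have.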
These algorithms model notions like diversity, coverage, information and representativeness of the summary. Query-based summarization techniques additionally model the relevance of the summary to the query. Some techniques and algorithms that naturally model summarization problems are TextRank and PageRank, submodular set functions, determinantal point processes, and maximal marginal relevance (MMR). === Keyphrase extraction === The task is the following. You are given a piece of text, such as a journal article, and you must produce a list of keywords or keyphrases that capture the primary topics discussed in the text. In the case of research articles, many authors provide manually assigned keywords, but most text lacks pre-existing keyphrases. For example, news articles rarely have keyphrases attached, but it would be useful to be able to assign them automatically for a number of applications discussed below. Consider the example text from a news article: "The Army Corps of Engineers, rushing to meet President Bush's promise to protect New Orleans by the start of the 2006 hurricane season, installed defective flood-control pumps last year despite warnings from its own expert that the equipment would fail during a storm, according to documents obtained by The Associated Press". A keyphrase extractor might select "Army Corps of Engineers", "President Bush", "New Orleans", and "defective flood-control pumps" as keyphrases. These are pulled directly from the text. In contrast, an abstractive keyphrase system would somehow internalize the content and generate keyphrases that do not appear in the text, but more closely resemble what a human might produce, such as "political negligence" or "inadequate protection from floods". Abstraction requires a deep understanding of the text, which makes it difficult for a computer system. Keyphrases have many applications.
They can enable document browsing by providing a short summary, improve information retrieval (if documents have keyphrases assigned, a user could search by keyphrase to produce more reliable hits than a full-text search), and be employed in generating index entries for a large text corpus. Keyword extraction is a closely related topic; how the two are distinguished depends on the literature and on how key terms, words, or phrases are defined. ==== Supervised learning approaches ==== Beginning with the work of Turney, many researchers have approached keyphrase extraction as a supervised machine learning problem. Given a document, we construct an example for each unigram, bigram, and trigram found in the text (though other text units are also possible, as discussed below). We then compute various features describing each example (e.g., does the phrase begin with an upper-case letter?). We assume there are known keyphrases available for a set of training documents. Using the known keyphrases, we can assign positive or negative labels to the examples. Then we learn a classifier that can discriminate between positive and negative examples as a function of the features. Some classifiers make a binary classification for a test example, while others assign a probability of being a keyphrase. For ins

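The example-construction step of the supervised setup described above can be sketched as follows. The sketch builds one example per distinct unigram, bigram, and trigram, computes a few features, and labels each example using a list of known keyphrases. The specific features (`starts_upper`, `length`, `first_position`, `count`), the tokenizer, and the function names are illustrative assumptions, not the feature set of any published system; the resulting examples could then be passed to any classifier.

```python
import re

def tokenize(text):
    """Split text into word tokens (letters, apostrophes and hyphens)."""
    return re.findall(r"[A-Za-z][A-Za-z'-]*", text)

def build_examples(text, known_keyphrases, max_n=3):
    """Return one (phrase, features, label) triple per distinct n-gram, n <= max_n."""
    tokens = tokenize(text)
    known = {k.lower() for k in known_keyphrases}
    examples = {}
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            key = phrase.lower()
            if key in examples:
                # Repeated occurrence: only update the frequency feature.
                examples[key][1]["count"] += 1
                continue
            features = {
                "starts_upper": phrase[0].isupper(),  # begins with an upper-case letter?
                "length": n,                          # number of words in the phrase
                "first_position": i / len(tokens),    # relative position of first occurrence
                "count": 1,                           # occurrences seen so far
            }
            # Positive label if the phrase matches a known keyphrase.
            examples[key] = (phrase, features, key in known)
    return list(examples.values())
```

Note that with `max_n=3` a four-word keyphrase such as "Army Corps of Engineers" never appears as a candidate, which is one reason other text units are sometimes used.
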
Asymmetric follow

An asymmetric follow social network is one which allows many people to follow an individual or account without having to follow them back. It is also known as asynchronous follow or sometimes asymmetric friendship. Asymmetric follow is a common pattern on Twitter, where someone may have thousands of followers, but themselves follow few (or no) accounts. In September 2010 Facebook started experimenting with a similar feature, which Facebook calls "Subscribe To."