Anderson's rule (computer science)

In the field of computer security, Anderson's rule refers to a principle formulated by Ross J. Anderson: systems that handle sensitive personal information involve a trilemma of security, functionality, and scale, of which you can choose any two. A system that has information on many data subjects and to which many people require access is hard to secure unless its functionality is severely restricted. If it has rich functionality, you may have to restrict the number of people with access, or accept that some information will leak.

Anthem medical data breach

The Anthem medical data breach was a medical data breach of information held by Elevance Health, known at that time as Anthem Inc. On February 4, 2015, Anthem, Inc. disclosed that criminal hackers had broken into its servers and had potentially stolen over 37.5 million records containing personally identifiable information. On February 24, 2015, Anthem raised the number of people whose personal information had been affected to 78.8 million. According to Anthem, Inc., the data breach extended into multiple brands that Anthem, Inc. uses to market its healthcare plans, including Anthem Blue Cross, Anthem Blue Cross and Blue Shield, Blue Cross and Blue Shield of Georgia, Empire Blue Cross and Blue Shield, Amerigroup, Caremore, and UniCare. Healthlink says that it was also a victim. Anthem says users' medical information and financial data were not compromised. Anthem has offered free credit monitoring in the wake of the breach. Michael Daniel, chief adviser on cybersecurity for President Barack Obama, said he would be changing his own password. According to The New York Times, about 80 million company records were hacked, and there is a fear that the stolen data will be used for identity theft. The compromised information contained names, birthdays, medical IDs, social security numbers, street addresses, e-mail addresses and employment information, including income data. == Theft of the data == The data was stolen over a period of weeks in the month before the data breach was discovered. Because no medical information was compromised, Anthem was not required by law to encrypt the data. However, Anthem faced several civil class-action lawsuits, which were settled in 2017 at a cost of $115 million. Anthem did not admit any wrongdoing in the settlement. Data from the attack is expected to be sold on the black market. == Impact == People whose data was stolen could face identity-theft problems for the rest of their lives. 
Anthem had a US$100 million insurance policy for cyber problems from American International Group. One report suggested that all of this money could be consumed by the process of notifying customers of the breach. == Responses == Anthem hired Mandiant, a cybersecurity firm, to review their security systems and advised people whose data was stolen to monitor their accounts and remain vigilant. The theft of the data raised fears generally about the theft of medical information. A writer from Harvard Law School suggested that this data breach might spark reform of security practices and government data safety regulation. An investigation conducted by several state insurance commissioners blames the breach on an attacker whose identity was withheld, and claims that the breach was likely ordered by a foreign government whose name was withheld. It also concluded that Anthem had taken reasonable measures to protect its data before the breach and that its remediation plan was effective at shutting down the breach once it was discovered. It also marks the starting date of the breach as February 18, 2014. The lead investigator was the Indiana Department of Insurance (DOI) -- Anthem's principal regulator, because Anthem is headquartered in Indiana. The Indiana DOI hired independent auditors to conduct a security assessment at Anthem, which concluded, "While deficiencies within Anthem’s cybersecurity posture were noted by the Examination Team, these deficiencies were not, in our experience, uncommon to companies comparable to Anthem in size and scope. While the pre-breach deficiencies impacted Anthem’s ability to reduce the likelihood of and quickly detect the Data Breach, the controls implemented subsequent to the Data Breach should improve Anthem’s ability to detect future breaches and enable Anthem to respond more effectively to a future attack than was the case in this instance." 
Federal regulators also conducted an investigation of the Anthem data breach, resulting in a $16 million settlement between Anthem and the Department of Health and Human Services (HHS), by far the largest HHS data breach settlement. An HHS Director overseeing the investigation said, "The largest health data breach in U.S. history fully merits the largest HIPAA settlement in history. Unfortunately, Anthem failed to implement appropriate measures for detecting hackers who had gained access to their system to harvest passwords and steal people's private information." The HHS settlement also required Anthem to perform a risk assessment and correct any identified deficiencies in its cybersecurity, with HHS oversight of Anthem's progress. Approximately 100 private class action lawsuits were filed against Anthem over the data breach and consolidated in California federal court before Judge Koh, a respected authority in data breach litigation. After contested briefing over who should lead the litigation efforts, Judge Koh appointed Eve Cervantez of Altshuler Berzon and Andy Friedman of Cohen Milstein as co-lead counsel, and appointed Eric Gibbs of Gibbs Law Group and Michael Sobel of Lieff Cabraser to head a Plaintiffs' Steering Committee. In 2017, Anthem agreed to settle the litigation for $115 million, the largest data breach settlement at the time. The attorneys requested $38 million in fees for their work on the case, but Judge Koh reduced the fee request, finding that only $31 million in fees were merited.

Operational database

Operational database management systems (also referred to as OLTP databases or online transaction processing databases) are used to update data in real time. These types of databases allow users to do more than simply view archived data: they allow that data to be modified (added, changed or deleted) in real time. OLTP databases provide transactions as their main abstraction, and transactions guarantee the so-called ACID properties. In essence, the consistency of the data is guaranteed in the case of failures and/or concurrent access to the data. == History == Since the early 1990s, the operational database software market has been largely taken over by SQL engines. In 2014, the operational DBMS market (formerly OLTP) was evolving dramatically, with new, innovative entrants and incumbents supporting the growing use of unstructured data and NoSQL DBMS engines, as well as XML databases and NewSQL databases. NoSQL databases have typically focused on scalability and have given up data consistency by not providing transactions as OLTP systems do. Operational databases increasingly support distributed database architectures that use distribution to provide high availability and fault tolerance through replication, as well as the ability to scale out. The role of operational databases in the IT industry is moving quickly from legacy databases to real-time operational databases capable of handling distributed web and mobile demand and of addressing big data challenges. Recognizing this, Gartner started to publish the Magic Quadrant for Operational Database Management Systems in October 2013. == Use in business == Operational databases are used to store, manage and track real-time business information. For example, a company might have an operational database used to track warehouse/stock quantities. 
As customers order products from an online store, an operational database can be used to keep track of how many items have been sold and when the company will need to reorder stock. An operational database stores information about the activities of an organization, for example customer relationship management transactions or financial operations, in a computer database. Operational databases allow a business to enter, gather, and retrieve large quantities of specific information, such as company legal data, financial data, call data records, personal employee information, sales data, customer data, data on assets and much other information. An important feature of storing information in an operational database is the ability to share information across the company and over the Internet. Operational databases can be used to manage mission-critical business data, to monitor activities, to audit suspicious transactions, or to review the history of dealings with a particular customer. They can also be part of the actual process of making and fulfilling a purchase, for example in e-commerce. == Data warehouse terminology == In data warehousing, the term is even more specific: the operational database is the one that is accessed by an operational system (for example a customer-facing website or the application used by the customer service department) to carry out regular operations of an organization. Operational databases usually use an online transaction processing database which is optimized for faster transaction processing (create, read, update and delete operations). An operational database is the source for a data warehouse. Data from an operational database can be loaded into an operational data store at a data warehouse before the data is processed into the data warehouse.
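
The transactional guarantee described earlier in this article can be illustrated with the warehouse stock example. The following is a minimal sketch using Python's built-in sqlite3 module, not a description of any particular product. The table names (stock, orders) and the helper functions open_demo_db and place_order are hypothetical and chosen only for illustration. The order row and the stock update are wrapped in one transaction, so if the stock is insufficient the already inserted order row is rolled back as well.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with hypothetical stock and orders tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (sku TEXT PRIMARY KEY, quantity INTEGER NOT NULL)")
    conn.execute("CREATE TABLE orders (sku TEXT NOT NULL, quantity INTEGER NOT NULL)")
    conn.execute("INSERT INTO stock VALUES ('widget', 5)")
    conn.commit()
    return conn


def place_order(conn, sku, quantity):
    """Record an order and decrement stock atomically; return True on success."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO orders (sku, quantity) VALUES (?, ?)",
                (sku, quantity),
            )
            cur = conn.execute(
                "UPDATE stock SET quantity = quantity - ? "
                "WHERE sku = ? AND quantity >= ?",
                (quantity, sku, quantity),
            )
            if cur.rowcount != 1:
                # Not enough stock: abort, which also undoes the INSERT above.
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False
```

With five widgets in stock, a first order for three succeeds and a second order for three is rejected, leaving two items in stock and exactly one row in the orders table.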

Transliteracy

Transliteracy is "a fluidity of movement across a range of technologies, media and contexts". It is an ability to use diverse techniques to collaborate across different social groups. Transliteracy combines a range of capabilities required to move across a range of contexts, media, technologies and genres. Conceptually, transliteracy is situated across five capabilities: information capabilities (see information literacy), ICT (information and communication technologies), communication and collaboration, creativity and critical thinking. It is underpinned by literacy and numeracy. The concept of transliteracy is influencing education and libraries. == History == While the term appears to come from the prefix trans- ('across') and the word literacy, the scholars who coined it say they developed it from the practice of transliteration, which means to use the letters of one language to write down a different language. The study of transliteracy was first developed in 2005 by the Transliteracies Research Project, directed by University of California at Santa Barbara Professor Alan Liu. The concept of 'transliteracies' was developed as part of research into online reading. It was shared and refined at the Transliteracies conference, held at UC Santa Barbara in 2005. The conference inspired Sue Thomas, at the time a professor at De Montfort University, to create the Production in Research and Transliteracy (PART) group, which evolved into the Transliteracy Research Group. The current meaning of transliteracy was defined in the group's seminal paper Transliteracy: crossing divides as "the ability to read, write, and interact across a range of platforms, tools, and media from signing and orality through handwriting, print, TV, radio, and film, to digital social networks." The concept was enthusiastically adopted by a number of professional groups, notably in the library and information field. 
Transliteracy Research Group Archive 2006–2013 curates numerous resources from this period. For a number of years, there was a gap between significant interest in transliteracy among professional groups and the scarcity of research. A group of academics from the University of Bordeaux considered transliteracy mainly in the school context. Freelance writer and consultant, Sue Thomas, studied transliteracy and creativity, while Suzana Sukovic, executive director of educational research and evidence-based practice at HETI, researched transliteracy in relation to digital storytelling. The first book on the topic, Transliteracy in complex information environment by Sukovic, is based on research and experience with practice-based projects. == Transliteracy in education == Transliteracy is making an impact on the classroom setting because of how technologically advanced younger generations are today. In 2012, Adam Marcus, a teacher and librarian at the New York City Department of Education (NYCDOE), decided to incorporate transliteracy into his school's public library summer reading program. He had a desire to enhance the experience of reading for his students by allowing them to connect to the text differently by using social media. He used a tool called VoiceThread in order to have his students "take part in conversations, formulate ideas, and share higher-order thinking through a variety of media channels: video, audio, text, images, and music". Students were also enabled to communicate with the book's author through blogs and websites, and were given multiple modes of media to comprehend and engage with the text on a deeper level. Some of these examples include an audio-video glossary and web links that aimed to bring the details of the text to life. The results of his experiment were deemed to have a positive effect on the program as students responded well to this interactive experience they were given. 
Marcus believes that it is important for educators and librarians to enhance storytelling for children by providing them with a modern and transliterate experience that was not available in the past. The Agence nationale de la recherche funded a program at a French high school from 2013 to 2015, where the transliteracy skills of students were tested and observed. Students were placed in groups of three or four members and were required to use all sorts of media and tools in order to collect data for their projects. They were not allowed to use only digital sources, and were advised to use a diversity of sources. The focus of this experiment was to observe "the possible diversity of media and tools employed, on the ways of and reasons for switching from one to another, on how these different media and tools are distributed within contexts, according to the academic requirements and tasks individually and collectively performed by the students." The experiment concluded that physical space and organization were issues for both students and teachers. Spatially, it was challenging for students to navigate through different mediums when their space inside the classroom was limited. It was noticed that students were prone to use something that took up less space, rather than focusing on expanding their diversity of sources. Organizationally, it was challenging for students to organize all of the information they collected, since not everything was being searched for and collected digitally. In addition, students were not allotted much time to complete their projects, which also affected their final product. == Transliteracy in libraries == In 2009, Dr. 
Susie Andretta, senior lecturer in Information Management at London Metropolitan University, conducted interviews with four different information professionals: an academic librarian, an outreach librarian, a content manager, and a scholar within the library science and information discipline. She was aiming to explore how transliteracy was colliding and combining with the print world of libraries. Dr. Andretta defines transliteracy as "an umbrella term encompassing different literacies and multiple communication channels that require active participation with and across a range of platforms, and embracing both linear and non-linear messages (3)." The goals of these interviews were to test the information professionals' awareness of transliteracy, to have them identify transliteracy and how it is integrated into their work, and to explain the impact transliteracy has had on the library they work at. Andretta found that out of all the information professionals interviewed, it was only the academic librarian who was vaguely familiar with the concept of transliteracy. Bernadette Daly Swanson, an Academic Librarian at UC Davis, expresses in her interview with Dr. Andretta how she would "like to think that the transliterate library is more of an environment where we do different things [...] I would take maybe about a third of the first floor of our library and transform it into a lab [...] where we can start to evolve [..] explore, and experiment in media development, content development, and do it not just with librarians; so open up the space for other people [...] so you don't get people working in isolation." Although the other three candidates that Dr. Andretta interviewed had not heard of the term transliteracy, they responded well to the concept once it was explained to them and agreed with its impact on the workplace. Dr. 
Michael Stephens, an assistant professor in the Graduate School of Library and Information Science at Dominican University, explains in his interview how the term transliteracy describes the courses he teaches on libraries and Web 2.0 technologies. Dr. Stephens states that students being educated in Web 2.0 technologies gives them "the opportunity to experience what the channel can be and the potential for that sharing learning, for asking questions, just for out loud thinking – I think it's incredibly valuable. [..] this is where this wonderful concept comes in, it was teaching them transliteracy and the fact that they can move across channels without getting worried about it." Dr. Andretta concluded from her interviews how although transliteracy may not be a very well-known term yet, it has nonetheless established itself into the intuition of libraries while also transforming the traditional library to a world of enhanced and expanded services. "Inherent in this transition are the challenges of having to adapt to a constantly changing technological landscape, the multiple literacies that this generates, and the need to establish a multifaceted library profession that can speak the multiple-media languages of its diverse users." Thomas Ipri, a librarian at the University of Nevada, advocates for libraries needing to make a change in their literary functions. He argues that the divide between digital and print makes it harder for libraries to accommodate their patrons and to share information. He f

Maritime Informatics

Maritime Informatics is a thematic topic within the broader discipline of informatics. It can be considered as both a field of study and domain of application. As an application domain, it is the outlet of innovations originating from data science and artificial intelligence; as a field of study, it is positioned between computer science and marine engineering. == Beginnings of maritime informatics == As a result of the increasing levels of digitalisation occurring in the maritime sector starting around 2010 and stimulated by the EU-endorsed MonaLisa project for sea traffic management (STM), a number of academics and shipping industry leaders recognised that the maritime transportation sector would benefit from a specific field of study and application to be known as Maritime Informatics - the use of information systems, data sharing and data analytics in the business and operations of maritime transportation. They considered that it would lead to improvements in efficiency, safety, resilience, and ecological sustainability - all of which are currently lacking for many aspects of sea transport. One of the first public airings of the concept of Maritime Informatics was a presentation delivered on 11 September 2014 in Gothenburg, Sweden. A proposal for an inaugural minitrack on Maritime Informatics was accepted for the 2015 Americas Conference on Information Systems in Puerto Rico, where three papers were presented. Since then, numerous publications have appeared; they are catalogued at www.maritimeinformatics.org. In late 2020, the first reference book on Maritime Informatics was co-written by 81 expert contributors (47 practitioners and 34 researchers) from 20 countries. The most impactful authors and journals in the domain have been documented in a review paper. Dimitrios Zissis, Luca Cazzanti and Leonardo M. 
Millefiori are the top three authors; top journals and conferences include Ocean Engineering, Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems, Sensors, the International Conference on Engineering, Technology and Innovation, Expert Systems With Applications, IEEE Access, and Journal of Navigation. == Background == The shipping industry has several particular organisational aspects that are recognised and taken into account in maritime informatics: it is predominantly a self-organising ecosystem; many activities are undertaken as part of episodic tight coupling; there is a so-called maritime stack; there is increasing pressure to balance capital productivity and energy efficiency; and there is the potential virtuous interplay between different types of systems. == Data sharing == Digital data sharing is key to the all-important, arguably fundamental, data analytics aspects of maritime informatics because it opens the way for better access to relevant and reliable data. As in land-based commerce, digital data sharing is a growing phenomenon in maritime operations - though there is a way to go. It is enabling greater transparency for all those involved in the transportation of goods and passengers, not least the end-customer. This leads to better and more informed decision-making and planning by all those involved. The push for digitalisation and data sharing is being pursued both by governments and the commercial sector. For example, the Member States of the IMO agreed a mandatory requirement for their governments to introduce electronic information exchange between ships and ports as from 8 April 2019. Meanwhile, commercial operators, particularly the container lines, are putting systems in place for sharing data for mutual benefit in their operations. 
Data sharing is an important aspect of the Port Collaborative Decision Making (PortCDM) and Port Call Optimization initiatives, both of which seek to improve the coordination, synchronization and efficiency of the port call process by enabling a common and shared situational awareness among all those involved. == Standardisation == The availability and sharing of relevant digital data underpins maritime informatics and is key to more effective and efficient coordination and synchronisation in the predominantly self-organising ecosystem that is maritime transportation. For this to occur, a high priority underpinning maritime informatics is the encouragement of standardised digital data exchange and data sharing, leading, in turn, to improvements in shipping analytics. Improved availability of data will support better historical analysis, now-casting and forecasting. The International Maritime Organization (IMO) FAL Committee is taking the lead in ensuring that the common terms used in the various standards being developed or in use in the maritime sector are compatible and therefore interoperable as far as is practicable, by creating and maintaining The IMO Compendium on Facilitation and Electronic Business. The IMO Compendium consists of an IMO Data Set and IMO Reference Data Model agreed by the main organisations involved in the development of standards for the electronic exchange of information related to the FAL Convention: the World Customs Organization (WCO), the United Nations Economic Commission for Europe (UNECE) and the International Organization for Standardization (ISO). 
There are several other prominent international governmental and non-governmental organisations actively contributing to the ongoing standardisation and harmonisation process including the UN Electronic Data Interchange for Administration, Commerce and Transport (UN EDIFACT), the Digital Container Shipping Association (DCSA), the International Harbour Masters Association (IHMA) and BIMCO - the world's largest direct-membership organisation for shipowners, charterers, shipbrokers and agents.

Termcap

Termcap (terminal capability) is a legacy software library and database used on Unix-like computers that enables programs to use display computer terminals in a terminal-independent manner, which greatly simplifies the process of writing portable text mode applications. It was superseded by the terminfo database used by ncurses, tput, and other programs. A termcap database can describe the capabilities of hundreds of different display terminals. This allows programs to have character-based display output, independent of the type of terminal. On-screen text editors such as vi and Emacs are examples of programs that may use termcap. Access to the termcap database was usually provided by separate libraries, e.g. GNU Termcap. Examples of what the database describes: how many columns wide the display is; what string to send to move the cursor to an arbitrary position (including how to encode the row and column numbers); how to scroll the screen up one or several lines; and how much padding is needed for such a scrolling operation. == History == Bill Joy wrote the first termcap library in 1978 for the Berkeley Unix operating system; it has since been ported to most Unix and Unix-like environments, even OS-9. Joy's design was reportedly influenced by the design of the terminal data store in the earlier Incompatible Timesharing System. == Data model == Termcap databases consist of one or more descriptions of terminals. === Indices === Each description must contain the canonical name of the terminal. It may also contain one or more aliases for the name of the terminal. The canonical name or aliases are the keys by which the library searches the termcap database. === Data values === The description contains one or more capabilities, which have conventional names. The capabilities are typed: boolean, numeric and string. The termcap library has no predetermined type for each capability name. 
It determines the types of each capability by the syntax: string capabilities have an "=" between the capability name and its value, numeric capabilities have a "#" between the capability name and its value, and boolean capabilities have no associated value (they are always true if specified). Applications which use termcap do expect specific types for the commonly used capabilities, and obtain the values of capabilities from the termcap database using library calls that return successfully only when the database contents matches the assumed type. === Hierarchy === Termcap descriptions can be constructed by including the contents of one description in another, suppressing capabilities from the included description or overriding or adding capabilities. No matter what storage model is used, the termcap library constructs the terminal description from the requested description, including, suppressing or overriding at the time of the request. == Storage model == Termcap data is stored as text, making it simple to modify. The text can be retrieved by the termcap library from files or environment variables. === Environment variables === The TERM environment variable contains the terminal type name. The TERMCAP environment variable may contain a termcap database. It is most often used to store a single termcap description, set by a terminal emulator to provide the terminal's characteristics to the shell and dependent programs. The TERMPATH environment variable is supported by newer termcap implementations and defines a search path for termcap files. === Flat file === The original (and most common) implementation of the termcap library retrieves data from a flat text file. Searching a large termcap file, e.g., 500 kB, can be slow. To aid performance, a utility such as reorder is used to put the most frequently used entries near the beginning of the file. 
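
The syntax-based typing described under "Data values" can be sketched in a few lines of Python. This is an illustrative sketch only, not the termcap library's own code: the function name parse_termcap_entry is hypothetical, the sample entry in the usage note is an abbreviated, made-up vt100-style description, and the sketch assumes the entry has already been joined into a single line. It does not handle inclusion of other entries, suppressed capabilities, or escape sequences inside string values.

```python
def parse_termcap_entry(entry):
    """Split a single-line termcap entry into (names, capabilities).

    The type of each capability is inferred from its syntax alone:
    'xx=value' is a string, 'xx#value' is numeric, and a bare 'xx'
    is a boolean that is true because it is present.
    """
    fields = entry.split(":")
    # The first field holds the canonical name and aliases, separated by '|'.
    names = fields[0].split("|")
    capabilities = {}
    for field in fields[1:]:
        if len(field) < 2:
            continue  # skip the empty field left by a trailing ':'
        name, rest = field[:2], field[2:]  # capability names are two characters
        if rest.startswith("="):
            capabilities[name] = rest[1:]        # string capability
        elif rest.startswith("#"):
            capabilities[name] = int(rest[1:])   # numeric capability
        elif rest == "":
            capabilities[name] = True            # boolean capability
        # Any other syntax is ignored in this sketch.
    return names, capabilities
```

For example, parsing the entry `vt100|dec vt100:co#80:li#24:am:cm=\E[%i%d;%dH:` yields the names `vt100` and `dec vt100`, the numeric capabilities `co` (80) and `li` (24), the boolean `am`, and the string capability `cm`.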
=== Hashed database === 4.4BSD-based implementations of termcap store the terminal description in a hashed database (e.g., something like Berkeley DB version 1.85). These store two types of records: aliases which point to the canonical entry, and the canonical entry itself. The text of the termcap entry is stored literally. == Limitations and extensions == The original termcap implementation was designed to use little memory: the first name is two characters, to fit in 16 bits; capability names are two characters; descriptions are limited to 1023 characters; and only one termcap entry with its definitions can be included, and it must be at the end. Newer implementations of the termcap interface generally do not require the two-character name at the beginning of the entry. Capability names are still two characters in all implementations. The tgetent function used to read the terminal description uses a buffer whose size must be large enough for the data, and is assumed to be 1024 characters. Newer implementations of the termcap interface may relax this constraint by allowing a null pointer in place of the fixed buffer, or by hiding the data which would not fit, e.g., via the ZZ capability in NetBSD termcap. The terminfo library interface also emulates the termcap interface, and does not actually use the fixed-size buffer. The terminfo library's emulation of termcap allows multiple other entries to be included without restricting the position. A few other newer implementations of the termcap library may also provide this ability, though it is not well documented. == Obsolete features == A special capability, the "hz" capability, was defined specifically to support the Hazeltine 1500 terminal, which had the unfortunate characteristic of using the ASCII tilde character ('~') as a control sequence introducer. 
In order to support that terminal, not only did code that used the database have to know about using the tilde to introduce certain control sequences, but it also had to know to substitute another printable character for any tildes in the displayed text, since a tilde in the text would be interpreted by the terminal as the start of a control sequence, resulting in missing text and screen garbling. Additionally, attribute markers (such as start and end of underlining) themselves took up space on the screen. Comments in the database source code often referred to this as "Hazeltine braindamage". Since the Hazeltine 1500 was a widely used terminal in the late 1970s, it was important for applications to be able to deal with its limitations.

Leiden algorithm

The Leiden algorithm is a community detection algorithm developed by Traag et al. at Leiden University. It was developed as a modification of the Louvain method. Like the Louvain method, the Leiden algorithm attempts to optimize modularity in extracting communities from networks; however, it addresses key issues present in the Louvain method, namely poorly connected communities and the resolution limit of modularity. == Improvement over Louvain method == Broadly, the Leiden algorithm uses the same two primary phases as the Louvain algorithm: a local node moving step (though the method by which nodes are considered in Leiden is more efficient) and a graph aggregation step. However, to address the issues with poorly connected communities and the merging of smaller communities into larger communities (the resolution limit of modularity), the Leiden algorithm employs an intermediate refinement phase in which communities may be split to guarantee that all communities are well-connected. Consider, for example, a graph in which three communities are present, distinguished by color (red, green and blue), and in which a center "bridge" node is a member of the community represented by blue nodes. Now consider the result of a node-moving step which merges the communities denoted by red and green nodes into a single community (as the two communities are highly connected). Notably, the center "bridge" node is now a member of the larger red community after node moving occurs (due to the greedy nature of the local node moving algorithm). In the Louvain method, such a merging would be followed immediately by the graph aggregation phase. However, this causes a disconnection between two different sections of the community represented by blue nodes. 
In the Leiden algorithm, the graph is instead refined. The refinement step keeps the center "bridge" node in the blue community so that the community remains intact and connected, despite the potential improvement in modularity from adding the bridge node to the red community.

== Graph components ==

Before defining the Leiden algorithm, it will be helpful to define some of the components of a graph.

=== Vertices and edges ===

A graph is composed of vertices (nodes) and edges. Each edge is connected to two vertices, and each vertex may be connected to zero or more edges. Edges are typically represented by straight lines, while nodes are represented by circles or points. In set notation, let $V$ be the set of vertices and $E$ be the set of edges:

$$\begin{aligned}V&:=\{v_{1},v_{2},\dots ,v_{n}\}\\E&:=\{e_{ij},e_{ik},\dots ,e_{kl}\}\end{aligned}$$

where $e_{ij}$ is the directed edge from vertex $v_{i}$ to vertex $v_{j}$. We can also write this as an ordered pair:

$$e_{ij}:=(v_{i},v_{j})$$

=== Community ===

A community is a set of nodes that shares no node with any other community:

$$\begin{aligned}C_{i}&\subseteq V\\C_{i}\cap C_{j}&=\emptyset \quad \forall \, i\neq j\end{aligned}$$

and the union of all communities must be the total set of vertices:

$$V=\bigcup _{i=1}C_{i}$$

=== Partition ===

A partition is the set of all communities:

$$\mathcal{P}=\{C_{1},C_{2},\dots ,C_{n}\}$$

== Partition quality ==

How communities are partitioned is an integral part of the Leiden algorithm. How partitions are decided can depend on how their quality is measured.
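The community and partition definitions given under Graph components can be checked mechanically. The following is a minimal Python sketch, assuming vertices are hashable labels and each community is a Python set; the helper name `is_partition` is hypothetical and not taken from any Leiden library.

```python
def is_partition(vertices, communities):
    """Return True if the communities are pairwise disjoint and cover all vertices."""
    seen = set()
    for community in communities:
        # Any overlap with previously seen nodes violates C_i ∩ C_j = ∅.
        if seen & community:
            return False
        seen |= community
    # The union of all communities must equal the vertex set V.
    return seen == set(vertices)


V = {1, 2, 3, 4, 5}
print(is_partition(V, [{1, 2}, {3, 4, 5}]))     # disjoint and covering
print(is_partition(V, [{1, 2}, {2, 3, 4, 5}]))  # node 2 appears twice
print(is_partition(V, [{1, 2}, {3, 4}]))        # node 5 is missing
```

The first call satisfies both conditions, while the second and third each violate one of them.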
Additionally, many quality metrics contain parameters of their own that can change the resulting communities.

=== Modularity ===

Modularity is a widely used quality metric for assessing how well a set of communities partitions a graph. For an adjacency matrix $A$, it is defined as:

$$Q={\frac {1}{2m}}\sum _{ij}\left(A_{ij}-{\frac {k_{i}k_{j}}{2m}}\right)\delta (c_{i},c_{j})$$

where: $A_{ij}$ represents the edge weight between nodes $i$ and $j$ (see Adjacency matrix); $k_{i}$ and $k_{j}$ are the sums of the weights of the edges attached to nodes $i$ and $j$, respectively; $m$ is the sum of all of the edge weights in the graph; $c_{i}$ and $c_{j}$ are the communities to which the nodes $i$ and $j$ belong; and $\delta$ is the Kronecker delta function:

$$\delta (c_{i},c_{j})={\begin{cases}1&{\text{if }}c_{i}{\text{ and }}c_{j}{\text{ are the same community}}\\0&{\text{otherwise}}\end{cases}}$$

=== Reichardt Bornholdt Potts Model (RB) ===

One of the most widely used metrics for the Leiden algorithm is the Reichardt Bornholdt Potts Model (RB). This model is used by default in most mainstream Leiden algorithm libraries under the name RBConfigurationVertexPartition. This model introduces a resolution parameter $\gamma$ and is highly similar to the equation for modularity.
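Because the RB model is described as highly similar to modularity, it may help to make the terms of the modularity formula concrete first. The following is a minimal Python sketch that evaluates Q directly from its definition, assuming an undirected graph given as a symmetric list-of-lists adjacency matrix and one community label per node. The function name `modularity` and the example graph (two triangles joined by a single edge) are illustrative assumptions, not part of any Leiden library.

```python
def modularity(adj, labels):
    """Evaluate Q = 1/(2m) * sum_ij (A_ij - k_i*k_j/(2m)) * delta(c_i, c_j).

    adj    -- symmetric adjacency matrix (list of lists of edge weights)
    labels -- labels[i] is the community of node i
    """
    n = len(adj)
    degree = [sum(row) for row in adj]  # k_i: total weight attached to node i
    two_m = sum(degree)                 # 2m: each undirected edge is counted twice
    if two_m == 0:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(n):
            # The Kronecker delta keeps only pairs in the same community.
            if labels[i] == labels[j]:
                total += adj[i][j] - degree[i] * degree[j] / two_m
    return total / two_m


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

print(modularity(adj, [0, 0, 0, 1, 1, 1]))  # one community per triangle: 5/14, about 0.357
print(modularity(adj, [0, 0, 0, 0, 0, 0]))  # everything in one community: 0
```

The double loop mirrors the sum over all pairs $ij$ and is quadratic in the number of nodes, so this form suits small illustrative graphs rather than large networks.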
For an adjacency matrix $A$, this model is defined by the following quality function:

$$Q=\sum _{ij}\left(A_{ij}-\gamma {\frac {k_{i}k_{j}}{2m}}\right)\delta (c_{i},c_{j})$$

where $\gamma$ represents a linear resolution parameter.

=== Constant Potts Model (CPM) ===

Another metric similar to RB is the Constant Potts Model (CPM). This metric also relies on a resolution parameter $\gamma$. The quality function is defined as:

$$H=-\sum _{ij}(A_{ij}w_{ij}-\gamma )\delta (c_{i},c_{j})$$

=== Understanding Potts Model resolution parameters/Resolution limit ===

Typically, Potts models such as RB or CPM include a resolution parameter in their calculation. Potts models were introduced as a response to the resolution limit problem that is present in modularity-maximization-based community detection. The resolution limit problem is that, for some graphs, maximizing modularity may cause substructures of a graph to merge into a single community, so that smaller structures are lost. A resolution parameter lets the user of the Leiden algorithm tune a modularity-like quality function so that small substructures are detected at a chosen granularity. The figure on the right illustrates why resolution can be a helpful parameter when using modularity-based quality metrics. In the first graph, modularity only captures the large-scale structures of the graph; however, in the second example, a more granular quality metric could potentially detect all substructures in a graph.

== Algorithm ==

The Leiden algorithm starts with a graph of disorganized nodes (a) and sorts it by partitioning them to maximize modularity (the difference in quality between the generated partition and a hypothetical randomized partition of communities).
The method it uses is similar to the Louvain algorithm, except that after moving each node it also considers that node's neighbors that are not already in the community it was placed in. This process results in our first partition (b), also referred to as $\mathcal{P}$. Then the algorithm refines this partition by first placing each node into its own individual community and then moving them from one community to another to maximize modularity. It does this iteratively until each node has been visited and moved, and each community has been refined; this creates partition (c), which is the initial partition of $\mathcal{P}_{\text{refined}}$. Then an aggregate network (d) is created by turning each community into a node. $\mathcal{P}_{\text{refined}}$ is used as the basis for the aggregate network while $\mathcal{P}$ is used to create its initial partition. Because we use the original partition $\mathcal{P}$ in this step, we must retain it so that it can be used in future iterations. These steps together form the first iteration of the algorithm. In subsequent iterations, the nodes of the aggregate network (which each represent a community) are once again placed into their own individual communities and then sorted according to modularity to form a new $\mathcal{P}_{\text{refined}}$, forming (e) in the above graphic. In the case depicted by the graph, the nodes were already sorted optimally, so no change too