
HTTP Cookies

Overview

HTTP is a stateless protocol; it mandates that a web server respond to a client’s request without relating the request to previous or subsequent requests.¹ Despite this statelessness, websites aggressively track and canvass their clients.² HTTP cookies are the mechanism for working around HTTP’s statelessness. A cookie acts as a container that stores and sends information about a specific client’s internet browsing patterns and preferences; this information is sent back to the website’s web server so that the server can tailor its response to the individual client.

What resources are being used?
Internet cookies typically store a client’s username for a particular website or some other string of data that can uniquely identify the client. Additionally, each cookie stores the name of the domain/website it is associated with, the path on the web server to which the cookie applies, and an expiration date for the cookie.
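As a rough illustration of the fields listed above, the sketch below builds a cookie with Python’s standard http.cookies module. The cookie name user_id, its value, the domain example.com, the path /shop, and the expiration date are all made-up values for illustration only.

```python
from http.cookies import SimpleCookie


def build_cookie(name, value, domain, path, expires):
    """Return a cookie carrying an identifier plus domain, path, and expiry."""
    cookie = SimpleCookie()
    cookie[name] = value                 # the string that identifies the client
    cookie[name]["domain"] = domain      # the website the cookie belongs to
    cookie[name]["path"] = path          # the part of the site it applies to
    cookie[name]["expires"] = expires    # when the browser should discard it
    return cookie


# Hypothetical values, chosen only to show the shape of a cookie.
cookie = build_cookie(
    "user_id", "fred42", "example.com", "/shop",
    "Wed, 01 Jan 2031 00:00:00 GMT",
)
print(cookie.output())  # the Set-Cookie header a server would send
```

Printing the cookie shows the single Set-Cookie header line that carries all four pieces of information to the browser.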

Why are the resources being used?
As previously mentioned, the stateless nature of HTTP prevents a web server from knowing the identity of the client making requests until the user authenticates (logs in to the web server); without some knowledge of the client’s identity, the web server is unable to accommodate user preferences or serve tailored content to the user. Nonetheless, there are many situations where it is advantageous for a web server to know the identity of the client so that it can tailor the delivery of content to user preferences. For instance, when a client first goes to amazon.com to do some shopping, it is advantageous for Amazon to immediately serve the client content about products that the client has viewed in the past, or other products that the website thinks the client may be interested in, without requiring the client to sign in first. Signing in to a website requires effort on the client’s part; Amazon needs to make the shopping experience as interesting and easy as possible to increase the likelihood that the client will be willing to buy something. To have a seamless way of identifying a given client in the future, the web server stores data about the client’s identity inside a cookie and then stores the cookie inside the client’s browser. When the client revisits the website, the web server identifies the client by retrieving the cookie data that it previously stored in the client’s web browser.

How much are the resources organized?

Cookies have a specific and limited scope. As specified by the Internet Engineering Task Force, a cookie can store only about 4 KB of data, roughly 4,000 characters. Cookies have a specific and limited scope because they serve a specific function when merged with other web server resources; a cookie’s explicit purpose is to help the web server identify a specific client. Once the web server extracts the data inside the client’s website cookie, the web server matches the data to the website’s database and/or maps the cookie data onto the website’s internal programming functions in the web server. Upon a successful match, the cookie data triggers the appropriate web server response.

When are the resources organized?

As mentioned earlier, when a user visits a website, the web server looks for the website’s cookie, which the user’s web browser sends along with the request if it has one stored. If the cookie does not exist, the web server sets one, and the browser stores the data about the client as a cookie file. Additionally, when a web server does find a website cookie, in many cases it will perform organizational maintenance on the cookie by updating or modifying the cookie data.
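The look-up-or-set behavior described here can be sketched in a few lines of Python. This is only a minimal sketch: the function handle_request, the cookie name client_id, and the use of a random identifier are assumptions made for illustration, not how any particular web server does it.

```python
import uuid
from http.cookies import SimpleCookie


def handle_request(cookie_header):
    """Return (client_id, set_cookie_header).

    cookie_header is the raw Cookie header sent by the browser, or None.
    set_cookie_header is None when the client was already known.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if "client_id" in jar:
        # Returning visitor: the browser sent back the cookie we set earlier.
        return jar["client_id"].value, None

    # First visit: make up an identifier and ask the browser to store it.
    new_id = uuid.uuid4().hex
    reply = SimpleCookie()
    reply["client_id"] = new_id
    reply["client_id"]["path"] = "/"
    return new_id, reply.output(header="Set-Cookie:")


first_id, header = handle_request(None)                      # first visit
again_id, no_header = handle_request(f"client_id={first_id}")  # revisit
```

On the first call the server has nothing to go on and issues a cookie; on the second call it recognizes the client from the cookie and sets nothing new.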

Who does the organizing activities?

All organizing activities surrounding cookies are conducted by the website/web server; it is completely responsible for organizing, accessing, and retrieving cookie data. The client is unaware of the organizing activities surrounding cookies that the website/web server is conducting.

Other considerations

Using cookies is an unreliable way for a web server to identify a given client. At any point, the client has the option and ability to delete any or all of the cookie files from his or her web browser. Many clients frequently delete cookies from their browsers over privacy concerns; many clients don’t want to be tracked by a website. Additionally, when a web server attempts to create a cookie file in a client’s web browser/hard disk, the client’s browser has the option to reject the incoming cookie.

¹ Hudson, Paul, PHP in a Nutshell (O’Reilly Media, Inc., 2009), 170.

² Auerbach, David, “You Are What You Click,” Nation 296, Vol. 9 (March 2013): 30.

Assignment 10

Overview

I have chosen an enterprise data warehouse as the organizing system that I would like to focus on. A mid- to large-sized corporation has many types of immovable and movable infrastructure, such as office buildings, office furniture, transport buses, and equipment. Of these, the most prized and dynamic type of resource is IT infrastructure. Servers, phone lines, network routers, laptops, PCs, and proxy networks are ubiquitous in an information-centric company. Managing these resources as efficiently as possible is a primary goal of a company’s management, because the more efficiently these resources are handled, the less of a cost center IT equipment becomes; it can even be seen as a profit center instead. IT infrastructure is a very important resource that needs to be organized logically so that we can support interactions (like adding or removing a resource) efficiently without affecting any dependent processes.

What resources are being used?

The resources used in a company are generally the hardware and software components it owns: a server, a network switch, a laptop, or an application. These resources support the company’s activities for both its internal and its external customers. But for our organizing system, which is a data warehouse, the resources are information components that hold data related to these physical objects. This information can be about when a server was added to a network, when it was upgraded, when an issue was reported about the hardware, and how much time it took to resolve. The format of the source data fed into the data warehouse is an important distinction for the ‘Extract’ process of a data warehouse. The data can be received in the form of flat files (plain text) or from relational databases. Each format requires its own technologies and handling procedures, like UNIX shell scripts for flat files and database procedures for relational data. Although all the information resources received are primary resources, they link to each other and hence form description resources for each other. For example, a ticket raised to replace a server may be linked to a ticket raised to report that server crashing. The two information resources are themselves primary resources, each containing details of when the ticket was raised, who it was assigned to, and when it was resolved, but by linking them together they form metadata for each other. This is a kind of shared information component network that the data warehouse tries to capture. Depending on our focus, the same record can be either a primary resource or a description resource. The resources primarily provide information about physical resources, but they also capture information about the processes related to these physical resources.

The resources are grouped together depending on the process or physical object they represent, such as information about a server or information about adding a new server. Although both might be related to the same server, they clearly have different domains and different types of information. As a result, they will be treated as different resources.

The resources are named according to the domain they belong to. For example, information about incidents raised for a piece of equipment can be grouped as an Incident entity, and all change-related information can be grouped as a Change entity. The naming of these entities in the data warehouse conforms to a controlled vocabulary and a fixed syntax. For example, ‘server change-related information’ can be referred to as CHG, and the corresponding table in the data warehouse would have a name with CHG as a suffix. Giving a server incident-related record a primary key such as ‘12345-INC’ ensures that there can be no collisions with a record in a different domain. This qualified name also makes the identifier more informative.
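A small sketch of this naming convention follows. The codes INC and CHG and the key ‘12345-INC’ come from the text above; the helper names and the table prefix SERVER are hypothetical and only illustrate the idea of a controlled vocabulary with a fixed syntax.

```python
# Controlled vocabulary: each domain maps to exactly one short code.
DOMAIN_CODES = {"incident": "INC", "change": "CHG"}


def qualified_key(record_id, domain):
    """Build a primary key such as '12345-INC' from a raw id and a domain."""
    return f"{record_id}-{DOMAIN_CODES[domain]}"


def table_name(prefix, domain):
    """Build a table name that carries the domain code as a suffix."""
    return f"{prefix}_{DOMAIN_CODES[domain]}"


incident_key = qualified_key(12345, "incident")   # '12345-INC'
change_key = qualified_key(12345, "change")       # '12345-CHG'
change_table = table_name("SERVER", "change")     # 'SERVER_CHG' (made-up prefix)
```

Because the domain code is part of the key, the same raw number can appear in two domains without colliding, and an unknown domain fails loudly with a KeyError instead of producing an unvetted name.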

Why are the resources organized?

The information resources are organized to facilitate easy reporting. When every resource is intrinsically linked to other resources, getting the bigger picture can be a problem for the mid-level and senior-level managers who are responsible for gauging the effectiveness of a new strategy and making the required corrections. A person who manages a team responsible for resolving server incidents would like to know how many issues were resolved by his/her team, whether the team is effective, whether he/she should add more people to the team, and whether he/she should change the way people are assigned issues to work on. A single go-to point that can help answer all these questions is an operational necessity. The faster this organizing system can answer these questions, the more relevant the data warehouse becomes. That ultimately depends on the types of interactions the system supports, like providing a canned report with trend graphs or exporting raw data so that it can be processed by another tool.

How much are the resources organized?

The user requirements decide how much the resources are organized. How long captured information stays useful is decided by the end users. Would they like to preserve data that was loaded into the data warehouse more than a year ago, or are they only interested in the last 2 months of data? If each user’s requirement is considered a set, then the union of the sets of all possible users establishes the minimum amount of data that needs to be maintained. The level of granularity of the data likewise depends on user requirements. If users are only interested in a high-level summarized view of the data, we need not preserve raw data pertaining to each and every ticket in the system. In this case the interaction is purely based on a collection-level property. There could be a new entity that will be reported as a standalone process without any relationships with other processes. In that case, if there is no reporting requirement, it makes sense to just add that resource to the system without explicitly modelling the relationships that it actually has with other entities. If a star database schema is sufficient to satisfy user requirements, there is no need for a snowflake data model, which is inherently more complex and difficult to maintain. There could be cases where two separate domains always go together, like ‘Incident’ information and ‘Rootcause’ information. Every incident would have a root cause. But should we introduce these as separate entities, or can we club them together into a single ‘Incident’ domain? This is a one-to-one relationship, which can easily be housed in a single table. I have noticed that this question is usually answered based on the performance implications of having too many resources in the system, since joining them and rendering them in a report is time consuming. So wherever possible, collocation (denormalization) makes better sense than unnecessarily creating too many resource categories.
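The set-union idea can be made concrete with a short sketch. Here each user’s requirement is modelled, purely as an assumption for illustration, as a set of (entity, month) pairs; the user names and months are invented.

```python
def minimum_data_to_keep(requirements):
    """Union of every user's requirement set: the least the warehouse must hold."""
    needed = set()
    for wanted in requirements.values():
        needed |= wanted
    return needed


# Hypothetical requirements, each a set of (entity, month) pairs.
requirements = {
    "team_lead": {("incident", "2013-03"), ("incident", "2013-04")},
    "director": {("incident", "2013-04"), ("change", "2013-04")},
}

keep = minimum_data_to_keep(requirements)
```

Data outside the union is wanted by nobody and is a candidate for archiving or summarizing; data inside it is needed by at least one user and has to stay.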

When are the resources organized?

The resources are organized according to the level of time granularity the user wants. In this specific case, the manager might want to look at the number of incidents resolved on a day-to-day basis. If the application or group being supported is critical, day-to-day monitoring makes sense. In this case any dip in the productivity of the team can be caught quickly and remedial action can be taken. The frequency of the data warehouse loads reflects this priority.

Who does the organizing?

Automated processes designed to Extract, Transform and Load data do the organizing on a daily basis. The designer of these processes, the data architect, creates the organizing system principles (data model) after interacting with the end users and inquiring about their requirements and specific needs. These requirements are almost always related to what kind of data the users would like to see together, in a single pie chart or in a single table. Understanding the relationships between the resources and maintaining the capability and efficiency of the system are the key concerns of the organizer.

Other considerations

Readability of the code and documentation matters a lot while creating a data warehouse. In this case, the person or team creating the organizing system moves on once the system has been created and a support team would maintain the system. If the design is too complex, or if the various processes were not documented properly, it would lead to a lot of issues in supporting the system and making modifications to existing functionality. The organizing principles need to be clearly enunciated so that new interactions can be added without affecting any existing functionality.

Organizing Digital Tumors

OVERVIEW: After months of reading the scholarly literature, attending lectures, and ruminating over the associative trails, a biologist might suddenly think of an idea that they wish to test. There are many methods of testing a fledgling hypothesis; some biologists might take to the lab, while others might pack for a long trip in the wilderness.

Today, due to the growing abundance of digital data, and the encouragement of open collaboration, some biologists have another option. Many researchers now upload their raw and processed datasets for community use. This is particularly relevant and useful for cancer biologists, who can use available genome sequences to answer research questions about a large sample of individuals, without expending as many resources to hand collect the data.

In this case study, I have chosen The Cancer Genome Atlas (TCGA), an online repository created through a joint effort between the National Cancer Institute, NHGRI, and certain divisions of NIH. To test a hypothesis about cancer, a biologist, like the one in this case study, might turn to the TCGA.

WHAT RESOURCES ARE BEING ORGANIZED?:

An online tumor data file is born digital, but acquires its identity from physical objects. Samples from the physical tissue are extracted, and then passed through a machine that sequences fragments of DNA. The process of sequencing returns many small fragments of DNA, which are represented as digital files. Different levels of processing and granularity will yield different types of files. In some cases, a whole genome sequence may be considered a resource, while another resource might consist of only a few mutations pulled from that sequence.

These resources and their associated metadata serve to represent and model the physical tumor. Even the process of creating the tumor sequence is largely intention driven, rather than an exact representation of the tumor. After the sequencer has output the DNA fragments, an unaligned genome resembles a puzzle, in which many pieces are duplicated, and many have errors. Computational methods using physical similarity and statistics play a role in creating and refining resources, as does the human biologist’s intuition.

WHY ARE THE RESOURCES ORGANIZED?:

Online tumor datasets are organized to facilitate easy retrieval for analysis. TCGA resources are arranged in a faceted classification system. This enables users to sort and retrieve based on multiple properties. Some users might be only interested in a certain type of data, such as clinical data, while others might wish to examine all types of data for a particular cancer type, such as ovarian. The faceted classification system enables yet a third user to look at ovarian clinical data.

Certain traits, closely resembling attributes, further enable precision in retrieval. For example, the user searching for ovarian cancer might also search for stage III as opposed to stage II ovarian cancer. As in many organizing systems, the imposition of categories and hierarchy often reveals patterns in resources, which can alter how users think about the resources. A glance at TCGA’s scheme demonstrates that the histology and anatomical origins of a tumor continue to drive how we analyze cancer.

HOW MUCH ARE THE RESOURCES ORGANIZED?:

Resources are organized at multiple levels of granularity. They are almost always organized by intrinsic static properties; age of files does not matter as much as the content, which should remain unchanged over time. Some of the organization draws from conventional medical standards that are used in many other domains.

Other levels of organization are based on standards drafted by TCGA or other organizations. An example of this can be seen in the TCGA defined .maf file, which contains information about mutations. Each mutation has several required fields, such as position and allele, and each field must adhere to a predefined format. Presumably, if a resource is not formatted correctly, a researcher will not be allowed to upload it, and so this control is fairly strict. This is with good reason, as poor or inconsistent organization within files will cause problems when computational analysis is run.
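To show what checking required fields against a predefined format might look like, here is a minimal sketch. The column names Chromosome, Start_Position, and Reference_Allele and the format rules are assumptions chosen for illustration; the actual TCGA specification for .maf files should be consulted for the real required columns and formats.

```python
import re

# Assumed required columns; not taken from the official specification.
REQUIRED_FIELDS = ("Chromosome", "Start_Position", "Reference_Allele")
# Assumed format rule: alleles use only the bases A, C, G, T or '-' for a gap.
ALLELE_PATTERN = re.compile(r"^[ACGT-]+$")


def validate_mutation(row):
    """Return a list of problems found in one mutation record (a dict of strings)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not row.get(field):
            problems.append(f"missing {field}")
    position = row.get("Start_Position", "")
    if position and not position.isdigit():
        problems.append("Start_Position is not a whole number")
    allele = row.get("Reference_Allele", "")
    if allele and not ALLELE_PATTERN.match(allele):
        problems.append("Reference_Allele has unexpected characters")
    return problems


good = {"Chromosome": "17", "Start_Position": "7577120", "Reference_Allele": "C"}
bad = {"Chromosome": "17", "Start_Position": "abc", "Reference_Allele": "X"}
```

A record that returns an empty list passes; anything else would be rejected before upload, which is the kind of strict control the paragraph above presumes.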

WHEN ARE THE RESOURCES ORGANIZED?

Resources are organized as they are added to the collection. When the site was first created, the staff of TCGA defined the categories and hierarchies of the organizing system. Currently, unaffiliated researchers deposit the resources into this preset system. The timing of resource addition to the collection usually coincides with the timing of a related publication. Researchers will likely only deposit their data after a study is finished, to minimize chances of a competing publication.

WHO DOES THE ORGANIZING?

Organizing is a joint effort between the researchers and organizations that collect the data, and those who curate the TCGA collection. TCGA maintains certain standards for formatting the different resource types and defines the overall structure into which resources are deposited. However, processing each tumor is also an organizing step. At the very lowest levels, when researchers “create” the resources, they must make some of the messier decisions, such as what stage a cancer is in and what quality of mutation is considered valid. In this sense, even though the TCGA will organize the physical files and provide structure, researchers play a role in adding the description and metadata that will determine where the resource is sorted.

OTHER CONSIDERATIONS:

It is likely that certain technical considerations will change over time, such as file formats, the sequencing platforms used, and the types of data available. It will be interesting to see how the site adapts and reorganizes, and what it does with older data, which may not be as high quality. The organizing principles of the collection will also likely parallel research developments. If patient age, for example, is shown to be more significant than histology, this change in research trends would likely be reflected in the organizing system.

Organizing Resources from Online Shopping Websites

OVERVIEW:

Online shopping websites are themselves usually complex organizing systems. It is very challenging to aggregate their resources for the purpose of searching or sorting. I used to work on a light project that leveraged several shopping sites to offer the top 4-5 best-matched instances in one search result snippet. A lot of interaction design issues then came up. The project grew so large that we ended up building an entire organizing system (http://gouwu.sogou.com/) based on resources from different sites. We were also able to define the description format, and eventually offered a unified API for all partners to update their resources in real time.

WHAT RESOURCES ARE BEING ORGANIZED?

At stage one, our service was able to integrate 8 different websites: taobao.com (ebay-like C2C site and the only C2C website), tmall.com (amazon-like general B2C website), amazon.cn (local version of amazon.com), 360buy.com (B2C, specialized in electronic devices), dangdang.com (B2C, specialized in books), vancl.com (B2C, specialized in clothes), m18.com (specialized in bags), and yhd.com (B2C, specialized in grocery).

As an aggregator, our system didn’t worry about the physical resources in our partners’ warehouses. All we wanted to organize was digital information about products, such as text and images. We also didn’t maintain any information about users’ profiles or transaction histories. It should be pointed out that we spent a lot of time designing the log format: what information to log and how to maintain that information. In my experience, log information is a crucial part of an online organizing system, and logging should be built along with the system rather than added afterwards. Proper analysis of user logs can clarify the requirements and verify the system design in any spiral or iterative lifecycle.

WHY ARE THE RESOURCES ORGANIZED?

We wanted to build an organizing system that provided a convenient and intelligent search function as well as an independent ranking service to our end users. The affordances of our system enabled large-scale, cross-system searching and ranking. Our system therefore targeted users with a clear search goal and a need for price comparison.

We also wanted to recommend good sites to our end users. A big concern of our end users was the credibility of online shopping websites, and this concern sometimes limited their choices in terms of price comparison. To address this problem, we carefully selected the sources of the instances in our system and did our best to keep the information timely and accurate.

HOW MUCH ARE THE RESOURCES ORGANIZED?

Scope-wise, an obvious obstacle was the heterogeneity of resources. By nature, some products can be grouped and compared, like cameras or cellphones with the same model or serial number, while others, like clothes, cannot. We accordingly designed two user interface templates and eventually unified their interaction process. Meanwhile, integrating the different categorization systems of our partners also cost us a lot of effort. There was apparently no standard or ecosystem yet. We therefore defined the granularity from scratch and managed to convert their categorization systems.

Lifecycle-wise, we went through several iterations during the first stage. E-commerce was a hot market at that time. Most of the domain-specific shopping sites later expanded their domains and changed their categorization systems to some extent. We had to address this problem and adjust accordingly, which echoed the characteristics of a complex organizing system’s lifecycle. It took us some time to redefine our goal and the users of our organizing system. Once the goal of our system was defined, the design and implementation could be carried out clearly. We eventually cut some unnecessary features (like the total number of similar items) and defined an API to get the description format.

WHEN ARE THE RESOURCES ORGANIZED?

As one of our first priorities was to make the information timely and accurate, we discarded the crawling method used in other vertical domains. Instead, we requested that information about instances be transferred immediately after it was published on our partners’ sites. However, we still designed a verification process that ran seconds before the instances were stored in our system. For example, we automatically checked URL availability and information accountability to filter spam.

WHO DOES THE ORGANIZING?

The whole organizing process was carried out automatically by our algorithms. We also applied data analysis techniques to monitor the performance of our algorithms. The log we carefully designed at the beginning came into play here. Our product manager would receive a daily report and analyze possible issues. We also had a real-time warning system in case of unusual situations.

OTHER CONSIDERATIONS

Despite the progress we made, there were still some issues to be solved. The category systems of our partners keep evolving, and our system should be able to adjust easily. For example, e-books are gaining more and more popularity, and user demand is now large enough to make them a category independent from books.

Composing my digital symphony

Overview

Orchestration is a resource-intensive activity. When composing symphonies, I manage thousands of different musical ideas across hundreds of instruments. To support this challenging process and to simplify my workflow, I leverage computer software known as a digital audio workstation (DAW). This program allows me to organize my musical ideas and listen to them using virtual instruments, which affords greater productivity and musical foresight than composing with only a pen and manuscript paper.

What resources are being organized?

When composing a symphony, the main resources I organize are instrument tracks. A track represents a musical instrument. It is an entity that contains information about the instrument and its score, and it also hosts a virtual instrument that brings the score to life. Instrument scores are sequences of notes, where each note has a position in time, a notation (i.e. solfege, or pitch), a duration, and a velocity (i.e. volume). Organizing instrument tracks entails specifying their display order in the DAW and assigning them colors.
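The track and note structure described above can be written down as two small data classes. This is only a sketch of the idea, not how any particular DAW stores its data; representing pitch and velocity as MIDI-style integers and measuring time in beats are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    start: float     # position in time, in beats (assumed unit)
    pitch: int       # MIDI-style note number, e.g. 60 for middle C (assumed)
    duration: float  # length in beats
    velocity: int    # loudness, 0-127 in MIDI-style terms (assumed)


@dataclass
class Track:
    instrument: str
    color: str
    notes: list = field(default_factory=list)  # the instrument's score

    def length(self):
        """Time at which the last note of the score ends."""
        return max((n.start + n.duration for n in self.notes), default=0.0)


flute = Track("Alto Flute", "green")
flute.notes.append(Note(start=0.0, pitch=60, duration=1.0, velocity=90))
flute.notes.append(Note(start=1.0, pitch=64, duration=2.0, velocity=70))
```

Organizing a track then amounts to setting its color and its position in a list of Track objects, while the score itself lives in the notes list.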

I could choose a more specific entity as my primary organizing resource. As an alternative to organizing by instrument tracks, I can organize by the instrument scores within each movement of my symphony. For a 50-track composition with 3 movements, this would increase my number of resources from 50 instrument tracks to 150 instrument-and-movement-specific instrument scores. I could then assign specific color shades to each movement, or create time markers to quickly navigate to movements I am working on. I typically consider this to be too granular though. With three times more resources to manage, this makes my organizing task more time consuming and generally not worth the effort.

Why are the resources organized?

Music composition is a long and winding creative process that requires as much organization as possible. My symphony can have over fifty tracks, each of which contains dozens of musical ideas. Because tracks are listed sequentially in the DAW, as my symphony grows, it becomes increasingly difficult to keep track of every instrument. Principles for ordering and grouping instruments allow me to save time when interacting with my resources. I also find it important to organize my resources in order to facilitate any eventual collaboration process with other composers.

How much are the resources organized?

Organizing tracks first requires me to define an ontology for my musical instruments.

At the least granular level, my ontology classifies instruments by orchestral family: woodwind, string, brass, or percussion. Afterwards, within each family, I classify instruments by instrument family. Woodwinds can be subdivided into flutes and reeds, and flutes can again be subdivided into alto flutes, bass flutes, and more. The end result is a hierarchical classification.

After defining my ontology, I can use it to specify the display order of my instruments in the DAW. Instruments that are the closest siblings and that have the most granular intrinsic properties are kept together (e.g. alto flutes and bass flutes) and are ordered according to pitch (e.g. alto flutes are displayed above bass flutes). Afterwards, all instrument tracks that are part of the same orchestral family are assigned the same color. For example, all percussion instrument tracks are highlighted in red, string instruments in blue, brass instruments in yellow, and woodwinds in green. This allows me to quickly identify a broad category of instruments that I wish to work on.
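The ordering and coloring rules can be sketched as a sort followed by a color lookup. The colors per orchestral family come from the paragraph above; the top-to-bottom order of the families, the tuple layout, and the numeric pitch values are assumptions made for the example.

```python
FAMILY_COLORS = {
    "percussion": "red",
    "string": "blue",
    "brass": "yellow",
    "woodwind": "green",
}
# Assumed top-to-bottom order of orchestral families in the DAW.
FAMILY_ORDER = ["woodwind", "brass", "percussion", "string"]


def arrange(tracks):
    """Order tracks by orchestral family, then instrument family, then pitch.

    Each track is (name, orchestral_family, instrument_family, typical_pitch);
    higher-pitched siblings are displayed above lower-pitched ones.
    """
    ordered = sorted(
        tracks,
        key=lambda t: (FAMILY_ORDER.index(t[1]), t[2], -t[3]),
    )
    return [(name, FAMILY_COLORS[family]) for name, family, _, _ in ordered]


layout = arrange([
    ("Bass Flute", "woodwind", "flute", 48),
    ("Violin", "string", "violin", 76),
    ("Alto Flute", "woodwind", "flute", 55),
])
```

In the resulting layout the two flutes sit together in green with the alto flute on top, and the violin follows in blue.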

One of the difficulties of organizing by ontology is that not all instruments have clear-cut families. The famous example is the piano, which is considered to be both a percussion and a string instrument as a result of its key-controlled hammers that hit strings inside the instrument’s body. The problem is compounded as more diverse instruments are introduced into the composition.

An alternative to organizing by instrument ontology is to organize by instrument score. The ontology only takes into account the static intrinsic properties of an instrument track, but this is sometimes less useful than its dynamic extrinsic properties. The instrument score is one of those properties. The piano, which often plays a lead role in the orchestra, is better positioned at the top of the DAW with its own color instead of with other string or percussion instruments.

When are the resources organized?

Organizing typically takes place when a track is first created, but it can change as a track’s role in the music evolves or when its relationship with subsequently-added tracks strengthens. Tracks that contain important motifs – recurring melodies or musical ideas – may have once played only a supporting role in the piece, and now need to be grouped with other lead instruments. Similarly, while adhering to the instrument ontology as much as possible, I attempt to position tracks that are invoked earlier in the music above tracks that are invoked later, but a track’s role in a piece is not always known at the start.

Who does the organizing?

I am usually the sole composer of my digital symphonies, and thus am the only person who organizes instrument tracks. However, I have occasionally worked with other composers, and in those cases my collaborators also organize.

Other considerations

A final important aspect of organizing is giving names to my tracks. Assigning my resource an extrinsic static property gives me a way to identify tracks quickly, in addition to a track’s order within the DAW. One difficulty of naming is uniqueness: multiple violin tracks, for instance, need to be differentiated. In these cases, properties of the instrument score are included in the name, such as “Violin Lead” or “Violin Movement I”.

In conclusion, I always organize tracks as much as possible. I find that the time I invest in determining a new track’s placement and color is well worth the time I save when referring to it hundreds of times throughout the life of a project. Generally, organizing my music projects greatly enhances my creative workflow, and allows me to spend less time tracking my existing ideas and more time coming up with new ones.

IP Addressing in the Global Internet

Overview

Most people take for granted that the Internet just works. They connect their computer to the Internet, it gets an IP address, and they’re able to communicate with a computer with a different IP address on the other side of the planet. How did their computer get the correct IP address? How does any computer or router get the correct IP address? How did the routers and other computers on the Internet get their IP addresses? Who decides which computers and which routers get which IP addresses? Why does it matter? After all, an IP address is just a string of 0’s and 1’s; aren’t they all the same?

What resources are being organized?

At its simplest, an IPv4 address is a 32-bit series of 0’s and 1’s. IPv4 addresses are a born-digital resource, as they have no canonical physical representation. The canonical digital representation that we’ve all become familiar with is called the ‘dotted quad’ format: four numbers between 0 and 255, separated by dots. For example, 169.229.216.200 is the IPv4 address for www.berkeley.edu.
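The relationship between the dotted quad and the underlying 32 bits can be checked with Python’s standard ipaddress module. The helper name to_bits is made up for this sketch; the address is the one used in the example above.

```python
import ipaddress


def to_bits(dotted_quad):
    """Return the 32-character string of 0's and 1's behind a dotted quad."""
    as_int = int(ipaddress.IPv4Address(dotted_quad))
    return format(as_int, "032b")


bits = to_bits("169.229.216.200")
# Split the bit string back into the four 8-bit groups of the dotted quad.
octets = [int(bits[i:i + 8], 2) for i in range(0, 32, 8)]
```

Each of the four numbers in the dotted quad is simply one 8-bit slice of the 32-bit address, which is why each number must fall between 0 and 255.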

Not all IP addresses belong to the same class. There are unicast, multicast, broadcast and experimental IPv4 addresses, and unicast addresses can be either public or private. There are also two different versions of IP addresses currently in use on the Internet, IPv4 and IPv6. We will focus on IPv4 unicast public IP addresses, since these are not only the most common but also the most important. This is roughly the range of IP addresses from 1.0.0.0 to 223.255.255.255, with some breaks in the middle for private IP address space.
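As a rough illustration of these classes, the sketch below sorts a few sample addresses using Python's standard `ipaddress` module. The `classify` function and its labels are assumptions made for this example, and the "private" label is a simplification: Python's `is_private` flag also covers other special-purpose ranges such as loopback.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def classify(text: str) -> str:
    """Return a coarse class label for an IPv4 address (illustrative only)."""
    addr = ipaddress.IPv4Address(text)
    if addr == LIMITED_BROADCAST:
        return "broadcast"
    if addr.is_multicast:   # 224.0.0.0/4
        return "multicast"
    if addr.is_reserved:    # 240.0.0.0/4, the experimental range
        return "experimental"
    if addr.is_private:     # e.g. 10.0.0.0/8, 192.168.0.0/16 (simplified)
        return "unicast (private)"
    return "unicast (public)"


for sample in ("169.229.216.200", "192.168.1.10", "224.0.0.5",
               "240.0.0.1", "255.255.255.255"):
    print(sample, "->", classify(sample))
```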

Why are the resources organized?

IP addresses can be grouped into blocks, or subnetworks, using a prefix and a mask. For example, 169.229.216.0/24 represents all 256 IP addresses in the range 169.229.216.0 to 169.229.216.255. Internet routers don’t have enough memory to hold routes for every individual IP address on the Internet. By organizing the Internet into subnetworks based loosely on a hierarchical model, routers are able to determine the correct path for every destination in the network without actually storing every address in their memory. If the organization of IP addresses were not handled properly, Internet routers would exhaust their memory space and parts of the Internet would become unreachable.
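The prefix idea can be sketched with Python's standard `ipaddress` module. The three-entry routing table and its next-hop labels below are invented for illustration; real routers hold hundreds of thousands of prefixes and use specialized lookup hardware, but the principle of choosing the most specific matching prefix is the same.

```python
import ipaddress

# A /24 fixes the first 24 bits, leaving 8 bits (256 values) for hosts.
block = ipaddress.ip_network("169.229.216.0/24")
first_address = block[0]
last_address = block[-1]
print(first_address, "to", last_address, "=", block.num_addresses, "addresses")

# Hypothetical routing table: a handful of prefixes stands in for
# every individual address they cover.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "default upstream",
    ipaddress.ip_network("169.229.0.0/16"): "campus core",
    ipaddress.ip_network("169.229.216.0/24"): "web servers",
}


def lookup(destination: str) -> str:
    """Pick the next hop whose prefix is the longest match for the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


print(lookup("169.229.216.200"))
print(lookup("169.229.1.1"))
print(lookup("8.8.8.8"))
```

Three table entries are enough to route every possible IPv4 destination here, which is the memory saving that prefix-based organization buys.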

How much are the resources organized?

Currently there is too much granularity in the global Internet routing table. For a router, it takes the same amount of memory to store a subnetwork with 256 IP addresses as it does to store a subnetwork with 65,536 addresses. So if our main concern is to minimize memory usage in Internet routers, thereby lowering operator costs and increasing stability, we want as little granularity as possible in the Internet routing table. The problem is that many organizations use non-contiguous IP subnetworks, which cannot be aggregated into larger subnetworks. This results in routers having to store many small subnetworks instead of fewer, larger subnetworks, which will eventually lead to memory exhaustion in older routers and possible reachability issues. Currently the full Internet routing table is approaching 500,000 routes. Most network engineers expect problems once the routing table grows past roughly 512,000 entries, since router memory limits typically fall on power-of-two boundaries.
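The difference between contiguous and non-contiguous subnetworks can be shown with `ipaddress.collapse_addresses` from Python's standard library. The address blocks are arbitrary private-range examples chosen for this sketch, not real allocations.

```python
import ipaddress

# Four adjacent /24 blocks that together fill exactly one /22.
contiguous = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(4)]

# Four /24 blocks scattered across the address space.
scattered = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in (0, 5, 9, 14)]

aggregated = list(ipaddress.collapse_addresses(contiguous))
not_aggregated = list(ipaddress.collapse_addresses(scattered))

print(len(contiguous), "routes collapse to", len(aggregated), ":", aggregated)
print(len(scattered), "routes collapse to", len(not_aggregated))
```

Both lists cover the same number of addresses, but only the contiguous one can be stored as a single routing table entry.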

When are the resources organized?

IP addresses are organized once someone configures one on a device or sets up a Dynamic Host Configuration Protocol (DHCP) server. If an organization exhausts its supply of free IP addresses, it will have to make a request to its upstream provider or Regional Internet Registry (RIR) for more address space. In the early days of the Internet, large blocks of IP addresses were given to organizations, but this led to many of the addresses in these blocks not being used. We are now reaching a point where we no longer have new addresses to assign to organizations.

Markets are now emerging for organizations to buy and sell IP addresses, and the organizations that have held on to large amounts of unused address space stand to make significant revenue from selling it. The problem is that when these organizations sell their unused IP address space, they will break up large allocations into smaller subnetworks, thereby increasing granularity and further accelerating the growth of the Internet routing table.
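A small sketch of why breaking up an allocation inflates the routing table, again using Python's standard `ipaddress` module. The /16 block and the assumption that every piece ends up with a different buyer (and therefore needs its own route) are hypothetical.

```python
import ipaddress

# One large allocation held by a single organization: one route.
allocation = ipaddress.ip_network("10.20.0.0/16")

# Assumption for illustration: the holder sells the space in equal pieces,
# each to a different buyer that must be routed separately.
sold_as_20s = list(allocation.subnets(new_prefix=20))
sold_as_24s = list(allocation.subnets(new_prefix=24))

print("held as one block:", 1, "route")
print("sold as /20 blocks:", len(sold_as_20s), "routes")
print("sold as /24 blocks:", len(sold_as_24s), "routes")
```

The number of addresses never changes; only the number of entries that routers must store grows.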

Who does the organizing?

The Internet Corporation for Assigned Names and Numbers (ICANN) is currently responsible for the initial allocation of IP addresses. It allocates /8 blocks of IP addresses to Regional Internet Registries (RIRs), which are then responsible for distributing allocations to organizations that request them. These organizations can then allocate IP addresses to smaller organizations, thus forming a loose hierarchy in which each level receives a subset of the IP address space from the organization above it. ICANN no longer has any /8 blocks of IP addresses left to allocate to RIRs. Once all of the RIRs have exhausted their last allocations from ICANN, organizations will have to rely on secondary markets to increase their IP address space.
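The loose hierarchy described here can be expressed as nested prefixes. The sketch below uses made-up blocks from private address space, not real registry records, and relies on `subnet_of`, which is available in Python 3.7 and later.

```python
import ipaddress

# Hypothetical chain of delegations, each a subset of the one above it.
registry_block = ipaddress.ip_network("10.0.0.0/8")     # held by an RIR
provider_block = ipaddress.ip_network("10.42.0.0/16")   # given to a provider
customer_block = ipaddress.ip_network("10.42.7.0/24")   # given to a customer
unrelated_block = ipaddress.ip_network("192.0.2.0/24")  # outside the chain

chain_is_nested = (
    provider_block.subnet_of(registry_block)
    and customer_block.subnet_of(provider_block)
)
print("delegation chain is nested:", chain_is_nested)
print("unrelated block inside registry block:",
      unrelated_block.subnet_of(registry_block))
```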

Other considerations?

The world of IP addressing will soon get a lot more interesting. The introduction of IPv6 as a replacement for IPv4 has been slow in coming and, while gathering momentum, continues at a snail’s pace. As organizations start purchasing IP addresses from one another, we should expect increased granularity and decreased stability in the Internet routing infrastructure. Whether or not normal Internet users notice will ultimately be determined by how quickly and effectively equipment vendors and engineers address the coming problems.

Assignment 10: Irina Lozhkina

1. Overview. I am going to describe a mobile payments domain: http://www.mobilepaymentstoday.com. It is a good collection of information related to mobile payments.

2. What resources are being used. The scope of the site covers information related to mobile payments: transactions, technology, marketing and news. The author and the site’s contributors gather information from other sources and present their point of view on each event or news item. The information has a high level of granularity. Two schemes are used to classify it. One (faceted) consists of “research centers”, from Bill Payment and Card Brands to Money Transfer/P2P and POS. The other is based on type of representation: features, news, videos, etc. The site also links to blogs outside of the domain.

3. Why are the resources organized. The highly granular structure of the resources allows users to navigate the site easily and efficiently. Users who are focusing on a specific type of payment can click on the appropriate category and read all the news on that topic. Users who are looking for general information on mobile payments can just follow the main news feed. This approach enables better interaction with the users.

In addition, the organization behind the site is Networld Media Group. Besides the mobilepaymentstoday.com domain, they have other domains covering other specific industries, providing news, forums, reports and other information. One of their forms of monetization is consulting for outside clients on different subjects. If you go to the bottom of the site, you can find a “Submit RFI” button. Clicking on it brings up a short questionnaire. Any company or person can submit a request to Networld Media and receive a solution (obviously for a predetermined payment). That is the reason why Networld Media offers its own interpretation of the news, and NOT just a mere collection of news from other sources, which would not add any credibility.

They are like LinkedIn: you can follow great news, but only up to a point. If you request more information, you will have to pay.

4. How much are the resources organized. The domain is highly specialized and would mostly be used by people interested in mobile payments. If the site becomes interdisciplinary and starts overlapping with other industries, the domain loses its purpose. Thus, it is essential to stay within the scope implied by the name of the domain.

In most cases it is difficult to put a single news item into a single category, because it usually overlaps with other categories. Decisions about organizing principles depend on the mainstream set of “acceptable” categories. Resources are organized by topic, by video, by trend: a faceted classification approach.

If a potential customer would like to get consulting from Networld, they have to fill out a standard questionnaire.

5. When are the resources organized. Although information nowadays appears online almost in real time (Twitter, Facebook, etc.), mobilepaymentstoday.com does not have to stream its news as soon as possible. The goal of the site is to provide reliable and thoughtful feedback. That is why it may take several hours to reach a conclusion about a particular event, but this builds credibility in the eyes of potential clients.

6. Who does the organizing. According to the site, there is a single editor, Cary Stemle. There is not much information that needs to be organized, so it seems logical that one person would be enough to make organizing decisions. It always takes less effort to manage a narrow collection.

7. Other considerations. If the site grows in scope or adds more categories, it will be important to reorganize it. One of the principles the creators of the site have to consider is keeping content separate from presentation.

Another consideration is using more tags and metadata. This will become very useful in the future, especially if the site grows in scope. There will be a trade-off, however, if the site starts using metadata: it may dilute the uniqueness of the content and so lose customers who previously relied on Networld’s opinion.

I202 | Information Organization and Retrieval

UC Berkeley Fall 2013 INFO 202
