Tag Archives: Ryan Baker

For the Love of Knowledge

  • Overview

As part of a small crew, I was making a documentary film that sheds light on the problems in the higher education system in India. We had travelled far and wide and captured many thought-provoking stories, illuminating interviews and shocking truths. Because of the relatively small crew and a tight schedule, our raw footage ended up labelled in a generic format (MVI_1234 etc.). As the director, I had the task of helping the editor rename and reorganize the files, both to make our lives easier and to do justice to the effort that went into capturing the clips, so that they could be incorporated in an impactful manner.

  • What resources are being used?

The primary resources being organized were the video clips (digital, shot on DSLRs) acquired during the shoot. In this context they could be classified as passive resources having no real capability to produce any significant value on their own, and which had to be acted upon or interacted with to produce any effect. But the key problem here was to formulate usable resource descriptions based on the following resource properties:

Intrinsic static: Date and time of creation, duration of the clip, type of external lighting used, camera used, lens used, exposure, ISO, white balance, frame rate, compression type

Extrinsic static: Shot sequence number (assigned to each story element during story boarding), shot movement type (dolly, follow focus, zoom, macro etc.)

During this particular stage the intrinsic and extrinsic dynamic properties did not play a large role in the resource descriptions.

We had done a lot of work on story boarding and had identified the right level of granularity, so we could capture each shot sequence separately. We therefore used the shot sequence number directly as an important part of the resource description, which helped us keep our descriptions short and meaningful.

Additionally, we realized that the audio clips captured along with the video also had to be organized. Since the two were intricately linked, we decided to give each audio clip the same name as its corresponding video clip, the only difference being the extension. We relied on the editing software to capture the intrinsic static properties of the audio files, such as bit rate and compression type.

 

  • Why are the resources organized?

Essentially, we were organizing these digital resources to find, identify and select them so as to weave a powerful narrative enabling us to convey the truth in an impactful manner.

Hence the interactions were directly with the primary resource.

The interactions that had to be supported by our organization scheme involved

–          Finding the clips related to a particular story board section
–          Selecting the best set of clips to be included in the film based on relevance to story, progression, continuation and several other inter-connected factors
–          Manipulating a clip (i.e. color correcting, white balancing and stabilizing) to create an aesthetic effect
–          Matching the video of a clip to corresponding audio recording
–          Adding the right background score based on sentiment being portrayed in the clips and the progression of the story
–          Providing subtitles in case of a foreign language or incoherent speech

 

  • How much are the resources organized?

Since the scope and size of our organizing system was relatively limited and all the resources were already available, we were able to make some bold decisions without causing a lot of problems. We formed a controlled vertical vocabulary for resource description by deliberately choosing certain resource properties over others. Our main objective was to keep the description as short as possible and at the same time convey the most valuable information that would help us interact with the resources i.e. the video clips.

We could easily have opted for an id based on the date and time stamp, which would have given every resource in a collection (i.e. the clips specific to one camera) a unique id. However, we realized that our cameras had already attached this information to each file, along with technical details like frame rate, aperture, shutter speed, ISO and white balance, and that our operating system and editing software could easily capture, display and search through it. Hence we decided not to use these details.

We also decided not to include important properties such as lighting conditions (kino-flo, LeikoLite etc.) and location, because the first frame of most of our clips showed the clap-board, which contained all of this information, and our editing software displayed every video file as a thumbnail of its first frame.

Thus we leveraged all of these to form a controlled vocabulary which placed the shot sequence number first, followed by the take number followed by camera id (i.e. camA, camB etc.)
For instance: 2A_1_camB
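The naming scheme above can be sketched in a few lines of Python. This is only an illustration, not the procedure we actually followed: the function names and the exact pattern (digits with an optional letter for the shot, a numeric take, and "cam" plus one letter for the camera) are assumptions inferred from the single example 2A_1_camB.

```python
import re
from typing import NamedTuple


class ClipName(NamedTuple):
    shot: str    # shot sequence number from the story board, e.g. "2A"
    take: int    # take number
    camera: str  # camera id, e.g. "camB"


# Assumed pattern, inferred from the example "2A_1_camB".
CLIP_PATTERN = re.compile(r"^(?P<shot>\d+[A-Z]?)_(?P<take>\d+)_(?P<camera>cam[A-Z])$")


def build_name(shot: str, take: int, camera: str) -> str:
    """Compose a clip name: shot sequence number, then take, then camera id."""
    return f"{shot}_{take}_{camera}"


def parse_name(stem: str) -> ClipName:
    """Split a clip name (without extension) back into its three parts."""
    match = CLIP_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"not a valid clip name: {stem!r}")
    return ClipName(match["shot"], int(match["take"]), match["camera"])
```

A generic camera name such as MVI_1234 does not fit the pattern and is rejected, which makes it easy to spot clips that have not been renamed yet.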

However, we did realize that these decisions were specific to our OS and video editing software and hence lacked interoperability.

 

  • When are the resources organized?

In our case, although we intended to organize the resources as soon as they were acquired, we failed to do so and instead devised an organizing system after all the resources had been acquired. We turned this to our benefit by forming a more specific description system.

 

  • Who does the organizing?

Ideally it is the role of the first assistant cinematographer (AC), or even the 2nd or 3rd AC (depending on the budget), to make sure all the files are properly named and stored and all the cards properly backed up. But due to our limitations, the director and the cinematographer collaborated to organize the raw footage.

 

  • Other considerations

One important consideration that we left out in the discussion was the need for certain people appearing in the documentary to have their identity hidden by means of blurring the face and voice modulation. Although we could not accommodate this interaction of identifying which clips had footage of people who did not want to reveal themselves, we could easily add the special effects over an entire sequence once all the clips were brought together.

Women’s Wallets


  • Overview (1 pt)

Even with the advancement of technology and the increased use of digital currencies and digital wallets, the physical wallet is still a predominant way to store money, forms of identification, and other money formats (such as credit cards). Even though resources can be stored in digital wallets, I use my physical wallet on a day-to-day basis: I take it to school in my backpack, or place it in my purse for shopping and leisure. This is a physical organizing system, so its size is a determining factor in the amount and type of physical resources that can be placed in it. The number of resources the wallet can hold is determined by the number of compartments it contains and by how many of those compartments have a special purpose, whether that is to hold currency or personal identification. As for the scope of women’s wallets, and specifically the wallet I carry around, the wallet holds resources that can be grouped by the value they have, which is either financial or personal identification.

  • What resources are being used? (2 pts)

Resources that fit this domain are constrained by their physical properties, such as size and weight, because they must fit into the wallet. The main categories of resources in this organizing system are those that have financial value and those that have personal identification value to the user of the wallet. Beyond these main categories, any resource that meets the physical requirement of fitting into the compartments of the wallet can belong in this organizing system. Examples include receipts and business cards. As a wallet is defined to be “a pocket-sized, flat, folding holder for money and plastic cards,” we see that the resources must have a certain function, such as currency value (money), or a certain physical property, such as being plastic (credit cards, personal identification, gift cards, etc.). The category of personal identification includes not only my own identification cards but also the identification of others in the form of business cards, which I place in a compartment specifically for business cards. These business cards are physical resources that carry information pointing to where I can learn about the individual digitally, such as their website. Thus the resource descriptions of some of the business cards contain digital information about the individual.

 

The resources in this organizing system are determined by the user of the wallet, which in this instance is myself. I control how long a resource stays in the wallet. One deciding factor is the resource description found under the label “expires” on personal identification cards, credit cards, and some gift cards, which is an extrinsic static property. If the date on which I am using the wallet is past the expiration date on one of these cards, the lifecycle of that resource in this organizing system has ended and I remove it. As a result, a resource’s existence in this organizing system depends on how long it is valid for use.
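The expiration rule described above can be written as a small Python sketch. The class and function names are hypothetical, and treating an item without an “expires” label (such as cash) as always valid is my assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class WalletItem:
    name: str
    # None means the item carries no "expires" label (assumption: e.g. cash).
    expires: Optional[date] = None


def is_valid(item: WalletItem, today: date) -> bool:
    """An item stays in the wallet until today's date is past its expiration date."""
    return item.expires is None or today <= item.expires


def remove_expired(items: List[WalletItem], today: date) -> List[WalletItem]:
    """Return only the items whose lifecycle in the wallet has not ended."""
    return [item for item in items if is_valid(item, today)]
```

Running `remove_expired` during routine maintenance keeps the wallet from filling up with cards that can no longer be used.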

  • Why are the resources organized? (2 pts)

The resources are organized based on the purpose they serve for the user and on their physical properties. They are organized so that their purposes can be differentiated. For example, the personal identification cards are placed in a different location from the physical currency (money). These resources are organized based on the value they create for the user. Cash is placed in one compartment of the wallet, separate from where credit cards are stored. The purpose of this is to distinguish between currencies that are liquid and those that are not. The wallet is divided into compartments whose properties are intrinsic and static: the compartments will not change size and will always be in the same position as long as the wallet exists. Maintenance may be needed from time to time to make sure the placement of resources in the compartments remains functional and to determine which resources are still valid based on their expiration dates. The resources are also organized so that the user can access a resource when it is needed, which makes the wallet easier to use. Finally, they are organized around the interactions the user will have with them; for example, if the user needs cash, it will always be in one compartment.

  • How much are the resources organized? (2 pts)

The resources are organized using faceted classification principles, in which all cards go on one side and currency (in cash form) goes in other compartments. The format of the currency determines where the coins go versus where the bills go, and demonstrates the abstraction hierarchy of works, in that money is differentiated by the format it comes in. Under faceted classification, some cards can serve both personal identification and financial purposes, such as my Bank of America debit card, which has my identification on it as well. This card can therefore be placed either with my personal ID cards on one side or next to them with the credit and financial cards. Physical properties such as size also determine whether a resource can be organized into the wallet at all.

 

  • When are the resources organized? (1 pt)

Resources are organized whenever the user needs to place or take out a resource, and for maintenance purposes. They are also organized according to their lifecycle: if a resource has expired, it is taken out of the wallet. Organization also happens whenever the user interacts with the system, whether for access, retrieval, or selection. If the user has a planned and systematic maintenance routine, organization occurs at that time as well.

  • Who does the organizing? (1 pt)

Wallets tend to be owned by one person. The owner, which in this case is myself, does the organizing of the resources. I am responsible for selecting which resources belong in this system and for maintaining the system so that the wallet does not get overfilled.

  • Other considerations (1 pt)

The wallet can also depend heavily on the other system it is usually paired with, such as a pocket or a purse. Which companion system it interacts with affects the type of resources that belong in the wallet, because some resources may be better placed in the companion, depending on how the user interacts with it. This in turn affects the interactions the user has with the wallet in terms of access and retrieval of certain resources, such as money, that can also be placed in the purse.

 

China Data Initiative

OVERVIEW 

China’s rapid development has an enormous impact on the rest of the planet. But despite the large amount of information collected by various government and research agencies, data on trends in China is still hard to find and interpret due to the huge silos created by various organizations and poor implementation of database/visualization technology.

The China Data Initiative (CDI) is designed to support existing information platforms focused on China by connecting their databases with one another, to reduce development cost and improve interdisciplinary understanding of environmental, health, and economic policy. CDI also works with organizational partners to create and publish new mashups of existing data sets. Finally, CDI is a forum where users can review the quality of datasets and annotate them.

CDI will work with various universities, government agencies, and research centers to aggregate data regarding China into a unified dataset. The project involves creating a robust but easy-to-use database system that can flexibly deal with large numbers of files, ranging from high resolution satellite photos to journal articles and large databases. Negotiation with various organizations that have different views on how data is shared and interpreted is a critical aspect of this initiative.

WHAT RESOURCES ARE BEING USED? 

The resources being used are statistical databases, spatial/photographic databases, and journal/article databases; organizing them so that they can be overlaid cohesively is the big challenge. Given the scale of data already available and the different needs of other nations, the CDI platform focuses only on China, as a global effort would be too complex.

All files on CDI should be properly categorized so that search and data comparison on the visualization layer can be done seamlessly. For example, statistics should be easily shown on a map of various power plants, together with the increase in pollution over time seen in satellite imagery, followed by details such as research from a medical journal on the effects of pollution on human health. In this example, three media are used to cover the issues of energy, pollution, and health in one interdisciplinary platform.

CDI is essentially a large open source management system. It might be easier to use separate but related organizing systems for each of the media types, but these fundamentally do not offer the benefits of an integrated data management system that is built to deal with all three varieties of datasets. The opportunity cost of finding the exact app for each type of dataset is also too high. While some apps might offer a high degree of functionality and usability, CDI’s long expected lifetime requires a flexible and elegant open source solution that will allow users to upload or export resources and resource descriptions with ease.
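As a rough illustration of how consistent categorization lets the three media be pulled together for one topic, here is a minimal Python sketch. It is not CDI's actual design: the `Resource` fields, the medium labels, and the `index_by_topic` helper are all hypothetical names introduced for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Resource:
    title: str
    medium: str              # assumed labels: "statistical", "spatial", "article"
    topics: FrozenSet[str]   # categories assigned to the resource


def index_by_topic(resources: Iterable[Resource]) -> Dict[str, Dict[str, List[str]]]:
    """Group resource titles first by topic, then by medium."""
    index: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for resource in resources:
        for topic in resource.topics:
            index[topic][resource.medium].append(resource.title)
    return index
```

With such an index, a visualization layer could look up a topic such as pollution and retrieve the matching statistics, satellite imagery and journal articles in one step.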

WHY ARE THE RESOURCES ORGANIZED? 

The goal of organizing large amounts of statistical, spatial, and explanatory data is to minimize search cost, improve interdisciplinary understanding of complex interdependent issues, and help maintain data integrity and transparency.

The three major target audiences for this venture are the general public, research institutions, and government agencies who lack the resources to aggregate all this data or the technology to extract the most meaning from it.

The datasets in the system will be rich with metadata in order to increase accuracy, which is currently lacking because of the disparate data collection standards in the many different disciplines that will populate the platform. Each dataset will carry this metadata layer to ensure accuracy, a major problem with research in China.

These decisions determine the requirements for the interactions the organizing system must support, but the repertoire of interactions is mostly determined by the choice of storage, visualization, and sharing application. The platform will be completely based in the cloud, but significant work must be done to identify platforms that work well behind the Great Firewall. Functionality must win out over complexity, which would overwhelm the less technology-savvy users, who are the majority.

HOW MUCH ARE THE RESOURCES ORGANIZED?

China Data Initiative will organize data in three major functional areas. Raw datasets from verified sources, spatial/photographic data, and research reports will be organized, since they represent the fundamental forms of data representation. Each topic will have a keyword tagging vocabulary that enables it to be activated and combined with other relevant visualizations or data sets. Other users will be able to add tags to improve the interdisciplinary understanding of the topic.

A carefully designed set of categories and a controlled tagging vocabulary will enable precise browsing, search, and analysis. The user community supports grouping and tagging of data sets, but not everyone should be allowed to group or tag datasets. Unaccredited members can view a dataset but not add to it. Annotation will be allowed so that verified professionals can add input or address mistakes. All the data will be stored in the cloud, making sure small files do not get lost on a random server somewhere.
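The access rules described above can be sketched as follows. This is a minimal illustration under stated assumptions: the three role names, the sample vocabulary, and the choice to let both accredited members and verified professionals tag are mine, not a specification of the platform.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Role(Enum):
    UNACCREDITED = "unaccredited"              # may only view
    ACCREDITED = "accredited"                  # may tag (assumption)
    VERIFIED_PROFESSIONAL = "verified"         # may tag and annotate (assumption)


# Placeholder controlled vocabulary; the real one would be designed by the curators.
CONTROLLED_VOCABULARY = {"energy", "pollution", "health"}


@dataclass
class Dataset:
    title: str
    tags: Set[str] = field(default_factory=set)
    annotations: List[str] = field(default_factory=list)


def add_tag(dataset: Dataset, role: Role, tag: str) -> None:
    """Tag a dataset, allowing only accredited roles and controlled terms."""
    if role is Role.UNACCREDITED:
        raise PermissionError("unaccredited members can view but not tag")
    if tag not in CONTROLLED_VOCABULARY:
        raise ValueError(f"{tag!r} is not in the controlled vocabulary")
    dataset.tags.add(tag)


def annotate(dataset: Dataset, role: Role, note: str) -> None:
    """Attach an annotation; reserved for verified professionals."""
    if role is not Role.VERIFIED_PROFESSIONAL:
        raise PermissionError("only verified professionals can annotate")
    dataset.annotations.append(note)
```

Checking the role before the vocabulary means an unaccredited member learns nothing about which tags are valid, while accredited contributors get a clear error when they stray from the controlled terms.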

WHEN ARE THE RESOURCES ORGANIZED?

Organizations and individual experts are required to apply existing categories or tags to their resource descriptions whenever resources are published on the platform. A quarterly review cycle by an expert committee drawn from leading organizations will be used to review anomalies, in conjunction with algorithms that can help detect bad or fraudulent data.

WHO DOES THE ORGANIZING?

A consortium of different non-profit and academic organizations focused on China takes on the role of editors and curators. A single organization lacks expertise on all these matters, but a group of organizations along with an independent community can help reinforce standards of quality.

OTHER CONSIDERATIONS

Maintenance of this collection for an indefinite time can be achieved if it becomes the de facto standard for data aggregation on China. Alternatively, having a major funder back the platform would work, but the budget must be spent conservatively, as it is not feasible to build internal teams to work on these data challenges alone, though a government and partner affairs department will be crucial.