Chapter 1 Hardware

In this video we will be talking about hardware. We distinguish hardware from software. The hardware is the physical equipment you will use to collect the audio data, while the software is the program that you’ll run later to extract and analyze data. We will be talking about software in a separate dedicated video.

Although we separate hardware and software because they are different, in the context of long-form recordings, hardware and software cannot be chosen independently from each other because the most commonly used system actually entails both a hardware and a software portion. This is the LENA system.

The LENA Foundation created hardware (a recording device they call DLP, digital language processor) as well as a software system, and they can only be used together. So if you are deciding whether to use LENA or not, then you either go all in or all out. This means that if you use the LENA software you need to use the LENA hardware and vice versa.

When we compare hardware, we will take into account 6 features:

  1. ease of use
  2. cost
  3. length of an audio recording bout
  4. total recording storage capacity
  5. childproofing
  6. any special features of the device

Let me explain each.

In ease of use, we consider how light the device is, how easy it is to make it into a wearable, and the simplicity of the user interface. The user interface is an important consideration for those allowing their participants to control when the recorder is on and off, which is important for ethical purposes, as we explain in the Video on IRB – see Chapter 8.

For the cost, we’ll typically talk about how much each individual recorder costs.

As for the length of the recording bout and total storage, these are not the same thing. Typically, the length of a continuous or intermittent recording is limited by the device’s battery, and not by its data storage or memory. So when we talk about the length of a recording bout, we mean what you can record with a single battery charge.

In contrast, total storage is how much data the recording device can hold before you need to extract the data. This is determined by the device’s memory limitations. So if you are working with families that can recharge the device, you can also ask them to record on several days (up to a week, depending on the device).
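To make the distinction concrete, here is a minimal sketch in Python of how the two limits combine. The function name is ours, introduced only for illustration, and the example figures are the ones we quote for the LENA device in section 1.1 (24h per charge, 72h of memory).

```python
def bouts_before_offload(bout_hours: float, storage_hours: float) -> int:
    """Number of full recording bouts that fit in memory before data must be extracted."""
    if bout_hours <= 0:
        raise ValueError("bout_hours must be positive")
    # The battery limits each bout; the memory limits how many bouts can accumulate.
    return int(storage_hours // bout_hours)


# LENA figures from this chapter: 24 h per battery charge, 72 h of memory.
print(bouts_before_offload(bout_hours=24, storage_hours=72))  # 3
```

In other words, the battery tells you how long one bout can be, and the memory tells you how many such bouts a family can record before the device has to come back to you.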

Also, we talk about devices that are more or less childproof – and by this we mean that it’ll be relatively hard for the child to break the device, e.g. by gaining access to it and stepping on it, or spilling water on it. All the devices we talk about are child-friendly, in the sense that they are lightweight.

For each recording device type, we’ll also discuss any specific or special features, such as recording quality.

1.1 LENA hardware

Let’s start with the most common device used, which is the LENA Foundation’s DLP.

The LENA Foundation’s hardware (like its software) is durable, stable, and simple to use for all participants involved in developmental language studies (researcher, practitioner, parent, and child). At present it costs about 400 US$ apiece, and you get a discount if you buy several of them. It allows you to record 24h straight, and then the battery runs out. You can also ask the family who is doing the recording to recharge it, which will allow them to record three such bouts. Once the recordings add up to 72h, the memory is full, and you need to extract the data before you can use the device again. It is perfectly childproof.

Regarding special features, please note that you need to pay for the LENA software separately - they have several licenses, and you can ask for a quote, but just to give you an idea, you probably need about 5000 US$ to get started. One unique feature of LENA is that the device has an accurate internal clock, which makes it easier to know when recording was started and stopped.

1.2 Non-LENA hardware

All other hardware options require you to use different software – you cannot use the LENA software unless you use their hardware too. So if you choose non-LENA hardware, you’ll need to make a separate set of decisions for the software. We’ll be discussing non-LENA software options in detail in a dedicated video – see Chapter 5.

In reality, you could use any alternative recording device as long as it is wearable and lightweight. In fact, even a device that wasn’t designed to be worn can be adapted for this purpose, for instance by using a money or paper clip to attach it to someone’s clothing. (We discuss clothing in a dedicated video – see Chapter 3.)

1.2.1 iPods

Some researchers have adopted iPods, which are as light as the LENA device. One advantage of iPods is that you can use a platform to program them to collect data intermittently, which is ideal if you want to record snippets here and there rather than continuously.

They have several disadvantages. To begin with, they are quite pricey, they are not childproof, and the interface on the actual hardware is not as simple to use as that of the other alternatives we discuss here. Therefore, they may not be the best choice, particularly if the families are going to turn recordings on and off by themselves.

They have rarely been used with children, and we have no first-hand experience with them in this setting, so we cannot tell you for sure what the length of a recording bout is, nor their total capacity. It is possible that iPods could be programmed to upload recordings to the cloud as they go, but we suspect this would shorten recording bouts, as uploading uses a good deal of battery. Recordings are time-stamped, so you can tell when each recording bout started and stopped.

1.2.2 Hand-held recorders

You can also use devices that are frequently used by field linguists and psychologists, such as Olympus hand-held recorders. Note that we don’t mean you are actually going to hold them – we just mean that they are small and sturdy, so they can be adapted as wearables. This is done by using a money or paper clip, or sewing pockets onto t-shirts – we’ll talk about these options in the Clothing video – see Chapter 3.

Some hand-held devices are quite a good option because they have better audio quality than LENA: they can have a higher sampling frequency and a flat frequency response, resulting in the highest quality among the devices we are discussing here. So if you want to look at details of the sounds people say, this may be the best choice.

There are a few disadvantages of hand-held recorders. To begin with, the interface is less easy for researchers and parents to use than other options, with menus that need to be navigated. Also, they are not childproof, so the child may be able to break them or affect the recording (e.g., by stopping it). They can be relatively heavy – although please note that some of the latest models, which do not use batteries, are as light as the LENA. Also, suppose the equipment is stolen after some recording has already happened: personal data will be lost and could fall into the hands of someone who uses it for purposes other than the intended one, and this is not under your control. Finally, since Olympus recorders typically have a USB port, anyone could simply copy the data you are recording and use it for their own purposes.

These options cost between 200 and 300 US$ apiece. The price varies because several models fit this description.

The one we have used had a maximum recording bout of 22 continuous hours. After this point, the recording stopped by itself. There was still some space left, so it is possible that recharging the batteries in the device would have allowed families to record several such bouts. To our knowledge, these recordings are not accurately time-stamped.

1.2.3 USB “spy” recording devices

An option that our team finds very promising is a USB device that has been mass-produced, and can be found in regular outlets (e.g., on Amazon) if you look for “spy audio recording USB.” The ones we used originally are no longer being sold, but you can find a link to a similar one in the resources section of this video. USB devices are much cheaper (10-20 US$ apiece) and you can find many different brands. We recommend you find a brand that time-stamps recordings, since families sometimes stop and restart the recording (because they choose to withhold data, or because you have asked them not to record when the child is away or asleep).

The USB devices we used have a battery life of 15 hours, and most of the other people we know to have used them report a similar length of recording bouts. One disadvantage of these devices is that sometimes they fail, resulting in recordings that are only 4 hours long, or shorter. We have dealt with this by using two devices, launched at the same time – this way, if one of them fails, you still have a back-up. So each recording bout will be at most 15 hours in length, but the devices can be recharged by plugging them into a USB charger. This will allow the family to collect several such recordings – for instance over 4 days. The precise storage capacity depends on the brand that you use.

They are very easy for researchers and parents to use because they only have an ON/OFF switch. Given their small size, and the fact that they come with a small cap, if you choose this option, we strongly encourage you to consider choking hazards. You can reduce this risk by removing the cap and clipping the USB to the child’s clothing in such a way that it becomes impossible for the child to pull it out. We’ll talk about how you do this in the clothing video – see Chapter 3.

The technical audio quality is comparable to that of the other devices, and although to the ear the sound is less sharp than with other devices, automated analyses do not show a great deal of difference. As with other devices we discussed, there is a security problem: in this case it is very obvious to users that they can plug the device into a computer that has a USB port and copy over the recording data, which constitutes a security hazard.

1.2.4 Babylogger

Our team is currently turning to the BABYLOGGER. The Babylogger is being developed in France, and at present it is still in the piloting stage. We have bought quite a few because we think it is very promising.

In our view, it has a few features that make it more interesting than the LENA device: in fact, it has been modeled on the LENA device with the goal of improving it. While it has about the same size, weight, sturdiness, and ease of use for the families as the LENA, the Babylogger has 4 microphones instead of just 1, and it contains an accelerometer, which is interesting if you want to analyze movement data.

But in our opinion, the very best features of the Babylogger are the length of the recording bout and of the total storage, and the fact that recordings are encrypted. Let me explain this.

The Babylogger allows more extensive recordings. As you recall, the LENA device allows you to record 24h straight, up to 72h if you recharge it. The Babylogger also allows for 24h continuous recordings. But when you recharge it, it will hold a week of recordings. Well, if I’m going to be perfectly accurate, since the Babylogger uses a removable SD card for its storage, you can also buy a card with greater capacity, which allows families to record for even longer.
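To give a feel for how card capacity translates into recording time, here is a rough sketch in Python. It assumes uncompressed PCM audio and ignores file-system overhead; we do not know the Babylogger’s actual file format or settings, so the sampling rate, bit depth and card size below are hypothetical values chosen only for illustration, and the function name is ours.

```python
def pcm_hours(card_gb: float, sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Hours of uncompressed PCM audio that fit on a card of card_gb gigabytes (10**9 bytes)."""
    # Bytes written per second = samples per second x bytes per sample x channels.
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return card_gb * 1e9 / bytes_per_second / 3600


# Hypothetical settings, NOT Babylogger specifications:
# a 32 GB card, 16 kHz sampling, 16-bit samples, 4 channels (one per microphone).
print(round(pcm_hours(32, 16_000, 16, 4), 1))  # 69.4
```

Under these assumptions, doubling the card capacity doubles the recording time, while adding channels or raising the sampling rate reduces it proportionally.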

Regarding encryption, recall that a problem with all other recording devices is one of privacy: external people can copy data and use it as they wish. This is not the case for the Babylogger: as far as we know, this is the only device that has on-device encryption of the data, so if the device is lost, people cannot copy data. When you buy it, you are the only one who has the key to decrypt the data on the device.

One downside of the Babylogger is that, since it is still in the piloting stage, it’s produced in small quantities, so the devices are very expensive: we estimate the cost at 600€ each (although this depends on how many you buy and how purchases are grouped). Also, we are helping test it, and sometimes we find issues that we report to the team - so it is a device that will change in the future.

1.2.5 Wireless systems

Other researchers have come up with solutions that work well only in the home, such as recording systems coupled with wireless microphones worn by the key child and selected family members. This is actually based on technology that has existed for a very long time – we think the Wells (1977) data, available from CHILDES, give a good example of the value of this technique. Fewer people are using this option, so we won’t talk about it very much in this series.

We have not used these first hand, so we cannot tell you the cost and ease of use for the family. We suspect that if the device controlling the recording can be plugged in, then the length of recording bouts can be quite long. Also, most portable microphones we know about should be quite easy to learn how to use, with a button to turn them on or off. These microphones tend to be way cheaper; nevertheless, they are not necessarily childproof, nor easy to use. Also, you need to set up a bit of equipment in the child’s home.

Such a system is a great solution to a problem shared by all the other devices, which are continuously worn by the child: with a home-based system, the child will not be recorded once they leave the home, which poses fewer privacy problems and fewer issues of incidental recording of people whose informed consent has not been obtained. We discuss these in the IRB video – see Chapter 8.

Since you can fit several family members with microphones, this will probably result in better audio quality for each person, and it also makes it easier to opt out of the recording. (For instance, if the parents do not want to be tracked at some point, they can turn off their microphones.) This will result in a multi-track recording, like with the Babylogger, which has 4 microphones (all in the same device). In the future, this may open up interesting technical approaches to separating sources and removing noise from the recordings, although to our knowledge no speech technology team is currently working on this.

1.3 The bottom line regarding hardware

Let me just say that there is no hardware option that is perfect and that everyone should use, so the choice really depends on your particular setting. I am going to talk about some general use cases that could help you make an initial decision.

For most people, we recommend LENA. In particular, if you are someone who doesn’t have a lot of technical background, you don’t have tech support in your institution, you don’t have mathematical or programming knowledge, you are not particularly interested in very fine details of what happens in the recordings, and you don’t have a lot of support to do human annotation, then probably the best choice for you would be a LENA product. Why? Because LENA has been used very widely, in many different labs, and your data would be comparable to many other people’s data. Also, it requires very little of you in terms of technical skills, it’s very easy to use, and it comes with a large community of users who could help you in many different ways. You could also cite a number of papers on how technical decisions have been made.

If you have some programming knowledge, and like tinkering with tech, then Babyloggers or the wireless systems are probably your best choices, provided you have the budget for them. These approaches have the most flexibility and richness, allowing you to contribute to the development of software that exploits these capabilities.

If you are doing field work in remote sites, our preference is the USB devices. All of the equipment suffers somewhat in hot and/or humid conditions, and in the presence of dust and/or salty air – in this setting, having a device where each unit is very cheap really takes the pressure off. They also consume the least electricity and are easier to charge than all other options. Also, security hazards of someone getting access to the recorded data are not likely if there are few working computers where you are. Please note we have a dedicated video with tips for people working in field conditions – see Chapter 19.

If you are doing large-scale studies, with hundreds of families, we’d also recommend USBs because they are the only option that scales when you have a smallish budget, and because experience shows that you’ll always lose some devices in this setting. That said, you need to take extra care in talking to everyone involved in order to make sure that there is no misappropriation of the data.

If you are doing a longitudinal study with a small number of families who are strong partners in the research, for instance with a minority community or in a multilingual setting, the Babylogger may be ideal because of the on-device encryption and the very large amount of data that can be held. This will allow families to record one day a week for several weeks, without having to interact with you to off-load the data on a computer.

If you are interested in phonetic or acoustic studies and need high-quality recordings, then hand-held recorders or a wireless system will be ideal, because you have actual control over the recording quality. In all other cases, you do not: the recording quality is set by the device.

If you are studying social interactions and want to make sure you capture interactions between the child and everyone else, then the wireless option is ideal, because no other option will capture everyone around the child in high quality with a nearby microphone. In our previous data, the voices of adult males and other children are much lower in intensity than those of female adults – we are not sure why that happens (perhaps young children tend to stay close to female adult carers), but it does make us worry that the other devices, with their low resolution, may not capture the speech of these other partners equally well.

1.4 Summary

| Hardware | Ease of use | Cost | Length of one bout | Storage capacity | Childproof | Pros | Cons | Best choice for |
|---|---|---|---|---|---|---|---|---|
| LENA | easy | 400 US$ | 24h | 72h | yes | accurate internal clock | / | For most people |
| iPods | not easy | ? | ? | ? | no | collect data intermittently, recordings are time-stamped | pricey | / |
| Olympus | not easy for parents | 200-300 US$ | 22h | ? | no | better audio quality than LENA, higher sampling frequency and a flat frequency response | heavy, data could be easily stolen, not accurately time-stamped | If you are interested in phonetic or acoustic studies |
| USB | easy | 10-20 US$ | 15h | depends on the brand | not completely: caps could be choking hazards | technical audio quality is comparable to that of the other devices, cheapest device | sometimes recording fails, data could be stolen | If you are doing field work in remote sites, or large-scale studies |
| Babylogger | easy | 600€ | 24h | a week | yes | modeled on LENA but has 4 microphones instead of 1 and an accelerometer, encrypted recordings, best storage | still in piloting stage: expensive and sometimes has issues | If you have some programming knowledge, and like tinkering with tech, or if you are doing a longitudinal study with a small number of families |
| Wireless systems | it depends | ? | quite long | ? | not necessarily | quite cheap, can be easy to use, causes fewer privacy problems because it doesn’t record outside the child’s home, has better audio quality | you need to set up quite a bit of equipment in the child’s home | If you are interested in phonetic or acoustic studies, or if you are studying social interactions |
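If you want to play with these trade-offs for your own study, the summary can be encoded as data and filtered. The sketch below is only an illustration: the class, field and function names are ours, unit costs use the upper figure quoted in this chapter, currencies are not converted, the LENA software license is not included, and wireless systems are left out because most of their entries are unknown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Device:
    name: str
    unit_cost: Optional[float]   # upper figure quoted in this chapter; None if unknown
    currency: str                # "USD" or "EUR"; costs are NOT converted
    bout_hours: Optional[float]  # length of one bout; None if unknown
    childproof: bool


# Values transcribed from the summary; "not completely" is stored as False.
DEVICES = [
    Device("LENA", 400, "USD", 24, True),
    Device("iPods", None, "USD", None, False),
    Device("Olympus", 300, "USD", 22, False),
    Device("USB", 20, "USD", 15, False),
    Device("Babylogger", 600, "EUR", 24, True),
]


def fleet_cost(device: Device, n_units: int) -> Optional[float]:
    """Hardware cost for n_units recorders, or None when the unit cost is unknown."""
    if device.unit_cost is None:
        return None
    return device.unit_cost * n_units


def shortlist(devices: List[Device], min_bout_hours: float,
              need_childproof: bool) -> List[str]:
    """Names of devices with a known bout length that meet both criteria."""
    return [
        d.name
        for d in devices
        if d.bout_hours is not None
        and d.bout_hours >= min_bout_hours
        and (d.childproof or not need_childproof)
    ]


print(shortlist(DEVICES, min_bout_hours=24, need_childproof=True))
print(fleet_cost(DEVICES[3], n_units=200))  # 200 USB recorders at 20 US$ each
```

A filter like this only narrows the list; the considerations discussed in section 1.3 (technical support, field conditions, data security) still need to be weighed by hand.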

1.5 Resources