Method and Apparatus for Recording and Replaying Caption Data and Audio Data
TECHNICAL FIELD
The present invention relates to a method and apparatus for recording and replaying caption data and audio data, which is used to learn a language.
BACKGROUND ART
A closed caption system is adopted for hearing impaired persons to read caption text corresponding to spoken dialog. A closed captioning standard in the United States is defined by the Federal Communications Commission (FCC). The standard specifies that the closed caption data should be transmitted on line 21 of every odd field of the video signal. The closed caption data consists of caption control codes and caption data, including information about the positions and attributes of the caption characters. There are three display modes depending on how the caption data is displayed: the Pop-On caption mode, the Paint-On caption mode and the Roll-Up caption mode. In most Off-Line (post-production) captioning for movies, videos, TV sitcoms, etc., the Pop-On caption mode is used. In On-Line (real-time) captioning for TV news, live TV shows, etc., the Roll-Up caption mode is used. The closed caption system can also be used to learn a language (for example, U.S. Pat. No. 5,572,260 dated Nov. 5, 1996). When the closed caption system is used to learn a language, a user may record a closed-captioned program such as a movie, TV sitcom, etc. on a video cassette tape by using a video cassette recorder (VCR) and replay the video cassette tape. However, it is difficult and inconvenient for the user to search for a caption text and to replay the audio signal corresponding to a selected caption text repeatedly. When the user searches for the caption text and replays the audio signal corresponding to the selected caption text, the user should rewind the video cassette tape back and forth and replay the tape at an appropriate point.
DISCLOSURE OF INVENTION
The purpose of the present invention is to provide a method and apparatus whereby users can record and replay caption data and audio data containing only the spoken dialogs corresponding to the caption data. For recording caption data and audio data, video signals and audio signals are input to the apparatus of the present invention. The input audio signals are converted to digital audio data by an analog-digital converter. The digital audio data is delayed by an input buffer that can accumulate audio data of maximum time dt1. A closed caption decoder extracts and decodes closed caption data from the input video signal. The decoded caption data is sent to a microcomputer. The microcomputer stores the caption data in a memory and transfers the caption data to a display control means that displays the caption data on a monitor. When a caption control code indicating caption display on screen is received and detected by the closed caption decoder, the microcomputer marks the last memory address of the stored caption data block, starts storing audio data in the memory and marks the starting memory address of the audio data. By using the input buffer, one can record audio data from time dt1 before the caption control code indicating caption display is received. After a caption control code indicating caption erasure is received and detected by the closed caption decoder, the microcomputer keeps storing audio data for a predetermined time. At the end of the predetermined time, the microcomputer stops storing the audio data and marks the last memory address of the stored audio data block. By repeating these processes, the apparatus records the caption data blocks and the audio data blocks containing only the spoken dialogs corresponding to the caption data blocks. With these caption data blocks and audio data blocks, users can easily scan the caption data blocks and replay the corresponding audio data blocks.
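The role of the input buffer can be illustrated with a small model (a hypothetical Python sketch, not the claimed hardware): a fixed-length FIFO holds the most recent dt1 seconds of samples, so that when the caption-display code arrives, recording can begin with audio captured up to dt1 earlier. The sample rate and capacity below are assumed figures for illustration only.

```python
from collections import deque

class DelayBuffer:
    """FIFO holding at most `capacity` samples (i.e. audio of duration dt1)."""
    def __init__(self, capacity):
        self.fifo = deque(maxlen=capacity)  # oldest sample dropped automatically

    def push(self, sample):
        self.fifo.append(sample)

    def drain(self):
        """On caption display: return everything buffered so far, i.e. audio
        starting up to dt1 before the control code arrived."""
        out = list(self.fifo)
        self.fifo.clear()
        return out

# Assumed 8 kHz sampling and dt1 = 1 s -> capacity of 8000 samples.
buf = DelayBuffer(capacity=8000)
for n in range(10000):   # feed 1.25 s of numbered samples
    buf.push(n)
captured = buf.drain()   # only the most recent dt1 seconds survive
```

Because the deque discards its oldest element once full, no explicit erase step is needed; this mirrors the text's "erases the first-in audio data after time dt1".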
BRIEF DESCRIPTION OF DRAWINGS
Fig. 1 is a block diagram of the apparatus for recording and replaying caption data and audio data.
Fig. 2 is a flowchart showing a method of recording caption data and audio data.
Fig. 3 is a timing diagram of caption data blocks, audio data blocks and spoken dialogs.
Fig. 4 shows memory maps of the caption data, the audio data and the memory addresses in the memory.
Fig. 5 is a flowchart showing a method of replaying caption data and audio data.
Fig. 6 is a flowchart showing a method of replaying caption data and audio data with the audio pause and audio repeat functions.
Fig. 7 is a flowchart showing processes of the audio pause and audio repeat functions in the ith block in Fig. 6.
Fig. 8 is a flowchart showing a method of scanning caption data blocks.
BEST MODE FOR CARRYING OUT THE INVENTION
Caption control codes and their timing as used in the present invention will now be explained. In Off-Line captioning for movies, videos, TV sitcoms, etc., closed caption data is encoded in Line 21 of the video signal in such a way that the appearance of caption data on screen is synchronized to the spoken dialog corresponding to the caption data. The appearance of the caption data on the screen is controlled by caption control codes. When a caption control code indicating caption display on screen is received, the caption data is displayed on the screen, and when a caption control code indicating caption erasure is received, the displayed caption data disappears from the screen. Therefore, the receiving time of the caption control code indicating caption display is approximately the starting time of the spoken dialog, and the receiving time of the caption control code indicating caption erasure is approximately the ending time of the spoken dialog. Hence, one can record only the spoken dialog corresponding to the caption data by using the receiving times of the caption control codes indicating caption display and caption erasure. By marking the memory addresses of caption data and audio data corresponding to the receiving times of the caption control codes when the caption data and the audio data are recorded, one can replay the caption data and the audio data according to the marked memory addresses.
The caption control codes indicating caption display and caption erasure are the End Of Caption (EOC) code and the Erase Display Memory (EDM) code, respectively, in the Pop-On caption mode. In the Pop-On caption mode, a Resume Caption Loading (RCL) code indicating the Pop-On caption mode is followed by caption data that includes caption character codes or information about the positions and attributes of the caption characters. The caption data is first stored in a non-display memory. The caption data block stored in the non-display memory is not displayed on screen. When the EOC code is received, the non-display memory and a display memory are swapped, and the caption data block in the display memory is displayed on the screen. When the EDM code is received, the caption data block displayed on the screen is erased. In the Paint-On caption mode, a Resume Direct Captioning (RDC) code indicating the Paint-On caption mode is followed by the caption data. The caption data is directly stored in the display memory and displayed on the screen. The caption data block displayed on the screen is erased by the EDM code. The RDC code and the EDM code are the caption control codes indicating caption display and caption erasure, respectively, in the Paint-On caption mode. In the Roll-Up caption mode, caption data following a Resume Roll Up (RU) code is displayed on the screen. The RU code and the EDM (or Carriage Return) code are the caption control codes indicating caption display and caption erasure, respectively, in the Roll-Up caption mode.
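The pairing of display and erasure codes per caption mode described above can be summarized in a small lookup (an illustrative Python sketch; the code names follow the text, and the function name is hypothetical):

```python
# Caption control codes that bracket a spoken dialog, per caption mode.
# The display code marks the approximate start of the dialog and the
# erasure code its approximate end.
CONTROL_CODES = {
    "Pop-On":   {"display": "EOC", "erase": "EDM"},
    "Paint-On": {"display": "RDC", "erase": "EDM"},
    "Roll-Up":  {"display": "RU",  "erase": "EDM"},  # or Carriage Return
}

def dialog_markers(mode):
    """Return the (start, end) control codes for the given caption mode."""
    codes = CONTROL_CODES[mode]
    return codes["display"], codes["erase"]
```

A decoder built this way could switch modes (e.g. from Pop-On to Roll-Up material) without changing the recording logic, since only the marker codes differ.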
The method and apparatus for recording and replaying caption data and audio data by using the caption control codes and their timing will be described in detail.
Fig. 1 is a block diagram of an apparatus for recording and replaying caption data and audio data according to the present invention. An instruction input means 10 provides an instruction selected by a user, such as an instruction for recording, an instruction for replaying, etc., to a microcomputer 11. For recording caption data and audio data, a video signal outputted from a video cassette recorder (VCR) is inputted to a closed caption decoder 13 through a video input terminal 12, and an audio signal outputted from the VCR is inputted to an analog-digital converter (ADC) 15 through an audio input terminal 14. The ADC 15 converts the analog audio signal to digital audio data. An input buffer 16 stores the audio data in a so-called first-in-first-out way. The input buffer 16 can accumulate audio data of maximum time dt1. If there is no signal from the microcomputer 11, the input buffer 16 erases the first-in audio data after time dt1. The closed caption decoder 13 extracts two-byte closed caption data from the inputted video signal and decodes the closed caption data. The decoded caption data, which includes caption characters or information about the position and attribute of a caption character, is provided to the microcomputer 11 via a bus 17. The microcomputer 11 stores the caption data in a memory 18 and transfers the caption data to a display control means 19, which displays the caption data on a monitor 20. The closed caption decoder 13 also detects the caption control codes indicating caption display on screen and caption erasure. When the closed caption decoder 13 detects the caption control code indicating caption display, it provides a signal to the microcomputer 11. Then, the microcomputer 11 stores the last memory address of the stored caption data block in the memory 18, starts storing audio data from the input buffer 16 and stores the starting memory address of the audio data block in the memory 18. When the closed caption decoder 13 detects the caption control code indicating caption erasure, it provides a signal to the microcomputer 11 and the microcomputer 11 stores the memory address of audio data corresponding to the receiving time of the caption control code in the
memory 18. Since the input buffer 16 delays audio data by time dt1, the memory address of audio data corresponding to the receiving time of the caption control code is obtained by adding the total memory addresses of audio data corresponding to time dt1 to the memory address of audio data stored in the memory 18 at the receiving time of the caption control code. If the closed caption decoder 13 does not receive the next caption control code indicating caption display within a predetermined time dt2, the microcomputer 11 stops storing audio data after time dt2, that is, after storing more audio data corresponding to time dt2. If the closed caption decoder 13 extracts and decodes closed caption data within time dt2, it provides the caption data to the microcomputer 11. Then, the microcomputer 11 stores the caption data in the memory 18 and transfers the caption data to the display control means 19 while it keeps storing audio data. If the closed caption decoder 13 detects the caption control code indicating caption display within time dt2, it provides a signal to the microcomputer 11. Then, the microcomputer 11 stores the last memory address of the next caption data block and the memory address of audio data stored in the memory 18 while it keeps storing audio data. This memory address of audio data is the starting memory address of the next audio data block. These processes are repeated to record the caption data blocks and the audio data blocks containing only the spoken dialogs corresponding to the caption data blocks.
When the instruction for replaying caption data and audio data is made on the instruction input means 10, the microcomputer 11 transfers audio data from the memory 18 to an output buffer 21 and a caption data block from the memory 18 to the display control means 19. A digital-analog converter (DAC) 22 converts the audio data to an analog audio signal that is turned into sound by a speaker 23. The display control means 19 displays the caption data block on the monitor 20. The display control means 19 can display several caption data blocks in various ways. As an example, the display control means 19 starts displaying the caption data block from the top row of the monitor 20. When the next caption data block is
received, the display control means 19 displays the next caption data block below the previous caption data block. If there is no space for the next caption data block after several caption data blocks are displayed, the display control means 19 scrolls up the displayed caption data blocks and displays the next caption data block at the bottom of the monitor 20. When the memory address of audio data transferred to the output buffer 21 is the memory address of audio data corresponding to the receiving time of the caption control code indicating caption erasure, the microcomputer 11 transfers the next caption data block to the display control means 19. These processes are repeated to replay the caption data and the audio data.
Fig. 2 is a flowchart showing the method of recording caption data and audio data in the Pop-On caption mode. In step S1, the closed caption decoder 13 extracts and decodes two-byte closed caption data from the inputted video signal. If the closed caption data is not the End Of Caption (EOC) code indicating caption display, the closed caption decoder 13 provides the decoded caption data to the microcomputer 11 (step S2). In step S3, the microcomputer 11 stores the caption data in the memory 18 and transfers the caption data to the display control means 19. The display control means 19 displays the caption data on the monitor 20. After step S3, the process returns to step S1. These processes are repeated until the EOC code is received and detected by the closed caption decoder 13. If the EOC code is received and detected in step S1, the closed caption decoder 13 provides a signal to the microcomputer 11 in step S2. Then, the microcomputer 11 stores the last memory address LADDR[C] of the stored caption data block in the memory 18 in step S4. In step S5, the microcomputer 11 starts storing audio data from the input buffer 16 and stores the starting memory address SADDR[A] of the audio data block in the memory 18. The memory addresses LADDR[C] and SADDR[A] represent the receiving time of the EOC code. The input buffer 16 can accumulate audio data of maximum time dt1. If there is no signal from the microcomputer 11, the input buffer 16 erases the first-in audio
data after time dt1. When the closed caption decoder 13 detects the EOC code at time t, the microcomputer 11 stores audio data from time t-dt1, before the EOC code is received, since the input buffer 16 delays the audio data by time dt1. Therefore, the memory address of audio data corresponding to the receiving time of the EOC code is SADDR[A]+B, where B is the total memory addresses of audio data corresponding to time dt1. Time dt1 may be about one second in the Off-Line captioning. In this way, one can avoid cutting off the beginning portion of the spoken dialog corresponding to the caption data block. In step S6, the closed caption decoder 13 extracts and decodes closed caption data if closed caption data is received. If the closed caption data is not the Erase Display Memory (EDM) code indicating caption erasure, the closed caption decoder 13 provides the caption data to the microcomputer 11 in step S7. In step S8, the microcomputer 11 stores the caption data in the memory 18 and transfers the caption data to the display control means 19 while the microcomputer 11 keeps storing audio data. After step S8, the process returns to step S6. When the EDM code is received and detected in step S6, the closed caption decoder 13 provides a signal to the microcomputer 11 in step S7 and the microcomputer 11 stores the memory address of audio data corresponding to the receiving time of the EDM code in the memory 18 (step S9). If the memory address of audio data stored in the memory 18 at the receiving time of the EDM code is EADDR[A], the memory address of audio data corresponding to the receiving time of the EDM code is obtained by adding B, the total memory addresses of audio data corresponding to the delay time dt1, to EADDR[A], that is, EADDR[A]+B.  Even though the EDM code is received, the display control means 19 keeps displaying the caption data block on the monitor 20 in this embodiment.
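The address arithmetic above can be made concrete with a short sketch. The sample rate, sample width, dt2 value and the two addresses below are hypothetical figures chosen only to illustrate how B and D would be derived; the text does not specify an audio format.

```python
def offset_addresses(sample_rate_hz, seconds, bytes_per_sample=1):
    """Number of memory addresses spanned by `seconds` of audio."""
    return int(sample_rate_hz * seconds * bytes_per_sample)

# Hypothetical figures: 8 kHz, 1-byte samples, dt1 = 1 s, dt2 = 0.5 s.
B = offset_addresses(8000, 1.0)   # addresses for the buffer delay dt1
D = offset_addresses(8000, 0.5)   # addresses for the trailing time dt2

SADDR_A = 0x1000   # starting address stored when the EOC code arrives
EADDR_A = 0x9000   # address stored when the EDM code arrives

dialog_start = SADDR_A + B    # audio aligned with the EOC code
dialog_end = EADDR_A + B      # audio aligned with the EDM code
block_end = EADDR_A + B + D   # last address of the audio data block
```

The constant offset B appears in both sums because the input buffer delays every sample by the same dt1, so adding B realigns any stored address with real time.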
In step S10, the microcomputer 11 checks whether the memory address of audio data stored in the memory 18 passes the memory address EADDR[A]+B+D or not, where D is the total memory addresses of audio data corresponding to the predetermined time dt2. If no closed caption data is received (step S11) until the memory address of the
stored audio data reaches EADDR[A]+B+D, the process proceeds to step S14 in which the microcomputer 11 stops storing audio data. More audio data corresponding to time dt2 is recorded to avoid cutting off the ending portion of the spoken dialog corresponding to the caption data block. After step S14, the process returns to step S1. If closed caption data is received and decoded by the closed caption decoder 13 (step S11) before the memory address of the stored audio data reaches EADDR[A]+B+D and if the closed caption data is not the EOC code, the closed caption decoder 13 provides the caption data to the microcomputer 11 in step S12. Then, the microcomputer 11 stores the caption data in the memory 18 and transfers the caption data to the display control means 19 in step S13. After step S13, the process returns to step S10. When the EOC code is received and detected by the closed caption decoder 13 in steps S11 and S12 before the memory address of the stored audio data reaches EADDR[A]+B+D, the process returns to step S4. In this way, one can record the caption data blocks and the audio data blocks containing only the spoken dialogs corresponding to the caption data blocks. The last memory addresses of the caption data blocks, the starting memory addresses of the audio data blocks and the memory addresses of audio data corresponding to the receiving times of the EDM code will be used to replay the stored caption data and audio data. Fig. 3 shows a timing diagram of caption data blocks, audio data blocks and spoken dialogs in the Pop-On caption mode. At time t1, the EOC code is received and caption data block C1 is displayed on screen. At time t2, the EDM code is received. Since spoken dialogs and caption data blocks are almost synchronized in the Off-Line captioning, spoken dialog D1 corresponding to caption data block C1 starts at around t1 and ends at around t2. In this embodiment, the caption data block is not erased by the EDM code. Similarly, the EOC code is received and
caption data block C2 is displayed at time t3, and the EDM code is received at time t4. Spoken dialog D2 corresponding to C2 starts at around t3 and ends at around t4. The EOC code is received and caption
data block C3 is displayed at time t5, and the EDM code is received at time t6. Spoken dialog D3 corresponding to C3 starts at around t5 and ends at around t6. Audio data block A1, recorded from t1-dt1 to t2+dt2, covers spoken dialog D1. Audio data block A2, recorded from t3-dt1 to t4+dt2, covers spoken dialog D2, and audio data block A3, recorded from t5-dt1 to t6+dt2, covers spoken dialog D3. In this example, the time interval between t2 and t3 is greater than dt1+dt2 and the time interval between t4 and t5 is less than dt1+dt2.
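The recording flow of Fig. 2 can be sketched as a small event-driven loop (illustrative Python; the event names are assumptions, and the EDM code followed by the dt2 wait of step S14 is collapsed here into a single "EDM_timeout" event):

```python
def record(events):
    """Minimal model of the Fig. 2 recording flow. Caption data accumulates
    until an EOC code closes the block and turns audio recording on; a
    timeout (no new EOC within dt2 of the EDM code) turns recording off."""
    blocks, caption, recording = [], [], False
    for kind, data in events:
        if kind == "caption":               # steps S2/S3, S7/S8, S12/S13
            caption.append(data)
        elif kind == "EOC":                 # steps S4/S5
            blocks.append("".join(caption))
            caption, recording = [], True
        elif kind == "EDM_timeout":         # step S14: no EOC within dt2
            recording = False
    return blocks, recording

blocks, recording = record([
    ("caption", "HELLO"), ("EOC", None),
    ("caption", " WORLD"), ("EOC", None),   # new EOC within dt2: keeps recording
    ("EDM_timeout", None),
])
```

Note how the second EOC arrives before any timeout, so audio recording runs continuously across both blocks, matching the case where the t4-to-t5 interval is shorter than dt1+dt2.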
Fig. 4 shows the memory maps of the caption data, the audio data and the memory address data shown in Fig. 3. The last memory address LADDR[C1] of caption data block C1 and the starting memory address SADDR[A1] of audio data block A1 are stored in the address memory at the receiving time t1 of the EOC code. The memory address of audio data stored in the audio memory at the receiving time t2 of the EDM code is EADDR[A1]. By adding the total memory addresses B of audio data corresponding to the input buffer's delay time dt1 to memory address EADDR[A1], the memory address EADDR[A1]+B of audio data corresponding to the receiving time t2 of the EDM code is obtained and stored in the address memory at t2. By adding the total memory addresses D of audio data corresponding to the predetermined time dt2 to memory address EADDR[A1]+B, the ending memory address EADDR[A1]+B+D of audio data block A1 is obtained. Similarly, the last memory address LADDR[C2] of caption data block C2 and the starting memory address SADDR[A2] of audio data block A2 are stored in the address memory at t3, and the memory address EADDR[A2]+B of audio data corresponding to the receiving time t4 of the EDM code is stored in the address memory at t4. The ending memory address of audio data block A2 is EADDR[A2]+B+D. The last memory address LADDR[C3] of caption data block C3 and the starting memory address of audio data block A3 are stored in the address memory at t5, and the memory address EADDR[A3]+B of audio data corresponding to the receiving time t6 of the EDM code is stored in the address memory at t6. The ending memory address of audio data block A3 is EADDR[A3]+B+D.
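Per block i, the address memory of Fig. 4 amounts to a triple: LADDR[Ci], SADDR[Ai] and EADDR[Ai]+B. A sketch of how a replayer might hold this table (hypothetical layout and addresses; B and D reuse the illustrative values of 8000 and 4000 addresses):

```python
B, D = 8000, 4000   # address counts for dt1 and dt2 (illustrative values)

# One entry per caption/audio block pair, mirroring Fig. 4's address memory:
# last caption address, starting audio address, and EADDR[Ai]+B.
address_table = [
    {"laddr_c": 0x0100, "saddr_a": 0x1000, "eaddr_ab": 0x9000 + B},
    {"laddr_c": 0x0180, "saddr_a": 0xB000, "eaddr_ab": 0xF000 + B},
]

def audio_span(i):
    """Full recorded span of audio block i, including the dt2 tail."""
    entry = address_table[i]
    return entry["saddr_a"], entry["eaddr_ab"] + D
```

Storing EADDR[Ai]+B rather than the raw EADDR[Ai] means the replayer never needs to know the buffer delay; the dt2 tail is reconstructed by adding the constant D.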
Fig. 5 is a flowchart showing a method of replaying caption data and audio data. A user can start replaying caption data and audio data by selecting a replay instruction on the instruction input means 10. In step P1, the microcomputer 11 reads out the starting address of the first audio data block A1 from the memory 18 and transfers audio data from the memory 18 to the output buffer 21. The DAC 22 converts the audio data to an analog audio signal and the analog signal is turned into sound by the speaker 23. In step P21, the microcomputer 11 reads out the last memory address LADDR[C1] of the first caption data block C1 from the memory 18. In step P31, the microcomputer 11 transfers caption data block C1 from the memory 18 to the display control means 19. The display control means 19 displays the caption data block on the monitor 20. In step P41, the microcomputer 11 reads out the memory address EADDR[A1]+B of audio data corresponding to the receiving time of the EDM code from the memory 18. In step P51, the microcomputer 11 compares the memory address of the transferred audio data with memory address EADDR[A1]+B. If the memory address of the transferred audio data is less than EADDR[A1]+B and if the stop instruction is made on the instruction input means 10 in step P61, the process ends. If the memory address of the transferred audio data is equal to or greater than EADDR[A1]+B in step P51, the process proceeds to step P22 in which the microcomputer 11 reads out the last memory address of the second caption data block. These processes for the first caption data block C1 and the first audio data block A1 are repeated for the second caption data block C2 and the second audio data block A2. These processes are repeated for all the caption data blocks and the audio data blocks, that is, from the first block to the last nth block in Fig. 5. If the memory address of the transferred audio data is equal to or greater than EADDR[An]+B in the last nth block, the process proceeds to step P7 in which the microcomputer 11 keeps transferring the audio data to the output buffer 21 until the last audio data is transferred. After the last audio data is transferred, the process ends.
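The replay flow of Fig. 5 reduces to a loop over the blocks: stream audio address by address, and bring up the next caption block whenever the transferred address reaches the stored switch point EADDR[Ai]+B. A compact sketch (Python; the sample counts and switch addresses are made-up inputs):

```python
def replay(n_samples, switch_points, captions):
    """Sketch of the Fig. 5 loop. switch_points[i] is the stored address
    EADDR[A(i+1)]+B at which caption block i+1 replaces block i on screen.
    Returns a trace of (address, caption shown) pairs."""
    shown = 0      # index of the caption block currently on screen
    trace = []
    for addr in range(n_samples):   # transfer audio address by address
        if shown < len(switch_points) and addr >= switch_points[shown]:
            shown += 1              # e.g. step P22: next caption block
        trace.append((addr, captions[min(shown, len(captions) - 1)]))
    return trace

# Three blocks; captions switch at hypothetical addresses 4 and 7.
trace = replay(10, [4, 7], ["C1", "C2", "C3"])
```

After the last switch point, the loop simply keeps streaming to the end of the audio, mirroring step P7.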
Fig. 6 is a flowchart showing a method of replaying caption data and audio data with audio pause and audio repeat functions. Processes for the audio pause and audio repeat functions are added to the flowchart shown in Fig. 5. Fig. 7 is a flowchart showing the processes of the audio pause and audio repeat functions in the ith block of Fig. 6. If the audio pause instruction is made on the instruction input means 10 (step P7i) when the ith caption data block Ci and the ith audio data block Ai are being replayed, the microcomputer 11 stops transferring audio data to the output buffer 21 in step P8i. When an audio resume instruction is made on the instruction input means 10 (step P9i), the microcomputer 11 starts transferring audio data to the output buffer 21 again in step P10i and the process proceeds to step P11i. If the audio repeat instruction is made on the instruction input means 10 (step P11i), the microcomputer 11 reads out the starting memory address SADDR[Ai-1] of the (i-1)th audio data block from the memory 18 (step P12i). In step P13i, the microcomputer 11 replays the (i-1)th audio data block by transferring audio data from memory address SADDR[Ai-1] to the output buffer 21. After step P13i, the process returns to step P4i. These processes are repeated from the second block to the last nth block in Fig. 6.
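The pause, resume and repeat controls of Figs. 6 and 7 can be modeled as a small controller (illustrative Python; the class, its attribute names and the address table are all assumptions made for the sketch):

```python
class Player:
    """Sketch of Figs. 6 and 7: pause halts the audio transfer, resume
    restarts it, and repeat rewinds to the start SADDR[A(i-1)] of the
    previous audio data block."""
    def __init__(self, block_starts, current_block=1):
        self.block_starts = block_starts  # SADDR[Ai] for each block i
        self.block = current_block        # block currently being replayed
        self.addr = block_starts[current_block]
        self.paused = False

    def pause(self):                      # step P8i
        self.paused = True

    def resume(self):                     # step P10i
        self.paused = False

    def repeat(self):                     # steps P12i/P13i
        self.block -= 1
        self.addr = self.block_starts[self.block]

# Suppose the second block (index 1) is playing at a hypothetical address 100.
p = Player(block_starts=[0, 100, 200], current_block=1)
p.pause()
p.resume()
p.repeat()   # jumps back to the previous block's starting address
```

Because the repeat jump only needs SADDR[A(i-1)], the marked starting addresses recorded during capture are exactly the state a repeat function requires.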
Fig. 8 is a flowchart showing a method of scanning caption data blocks and replaying the audio data block corresponding to a selected caption data block. When an instruction for scanning caption data blocks is made on the instruction input means 10, the microcomputer 11 reads out the last memory address LADDR[C1] of the first caption data block C1 from the memory 18 in step T11. In step T21, the microcomputer 11 transfers the first caption data block C1 to the display control means 19 and the display control means displays the first caption data block C1 on the monitor 20. If a next caption instruction is made on the instruction input means 10 (step T31), the process proceeds to step T12 for the second caption data block C2. If an audio replay instruction is made on the instruction input means 10 (step T41), the microcomputer 11 reads out the starting memory address SADDR[A1] of the first audio data block and the memory address EADDR[A1]+B of audio data corresponding to the receiving time of the EDM code from the memory 18 in step T51. In step T61, the first audio data block A1, from memory address SADDR[A1] to EADDR[A1]+B+D, is transferred to the output buffer 21. The audio data is converted to an analog audio signal by the DAC 22 and the analog audio signal is turned into sound by the speaker 23. After step T61, the process proceeds to step T71. If a previous caption instruction is made on the instruction input means 10 (step T71), "No Previous Caption" is displayed on the monitor 20 (step T81) and the process proceeds to step T91. If a stop instruction is made on the instruction input means 10 (step T91), the process ends. These processes for the first caption data block are repeated for the second caption data block except for step T82. If the previous caption instruction is made in step T72, the process returns to step T11 of the first caption data block in step T82. The processes for the second caption data block are repeated for the rest of the caption data blocks. If the next caption instruction is made in step T3n of the last nth caption data block, "No Next Caption" is displayed on the monitor 20 in step T10 and the process returns to step T3n.
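The scan navigation of Fig. 8 is essentially a cursor over the recorded blocks with edge messages. A sketch (Python; function and message handling are illustrative, though the two messages are quoted from the text):

```python
def scan(n_blocks, commands):
    """Sketch of Fig. 8 navigation: 'next'/'prev' move the displayed caption
    block; at either end the corresponding message is shown instead."""
    i, messages = 0, []
    for cmd in commands:
        if cmd == "next":
            if i == n_blocks - 1:
                messages.append("No Next Caption")      # step T10
            else:
                i += 1
        elif cmd == "prev":
            if i == 0:
                messages.append("No Previous Caption")  # step T81
            else:
                i -= 1
    return i, messages

# Walk forward past the last block, then backward past the first.
pos, messages = scan(3, ["next", "next", "next", "prev", "prev", "prev"])
```

An audio replay instruction at any cursor position would then use the address table entry for block i, as described above, to stream SADDR[Ai] through EADDR[Ai]+B+D.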
In the above embodiment, the EOC code and the EDM code are used for recording closed captions and audio signals in the Pop-On caption mode. However, other caption control codes can be used to obtain the starting and ending times of the spoken dialog in the Paint-On or the Roll-Up caption mode. In the Paint-On caption mode, the Resume Direct Captioning (RDC) code and the EDM code can be used, and in the Roll-Up caption mode, the Resume Roll Up (RU) code and the EDM (or CR) code can be used.
As described above, the method and apparatus for recording and replaying caption data and audio data can record and replay the caption data blocks and the audio data blocks containing only the spoken dialogs corresponding to the caption data blocks. With the apparatus, a user can scan the caption data blocks, select a caption data block and replay the audio data block corresponding to the selected caption data block. Hence, this apparatus helps the user learn a language by both reading the caption data blocks and hearing the spoken dialogs corresponding to the caption data blocks repeatedly.