Stop SOPA/PIPA: Blackout Day

January 18th, 2012

A number of sites are blacking out today in protest of SOPA/PIPA. Another large number are adding banners or making other obvious visual changes. I decided not to bother on this site, because my readership is so low (zero, I think).

I liked this video so I figured I’d throw it in:

Some sites that are blacking out today:

Rep. Lamar Smith is claiming that this is just a “publicity stunt” on Wikipedia’s part. Really? A publicity stunt? How does Wikipedia have such a low profile that they need publicity? The MPAA is calling Wikipedia out for being “irresponsible”. My favorite comment on this story (and there are a lot!) points out MPAA’s hypocrisy very simply:

Taking down the websites of others = Not abuse
Taking down your own website = Abuse

Stop American Censorship points you to a number of other places like SOPA Strike and Fight for the Future and SOPA Countdown.

And more. I may update this as I find more and as time permits, but I do have to get some work done. Still, look around a bit and you’ll find tons of examples of people protesting this bill.

Lamar Smith and the other proponents are, of course, trying to play it down and pretend it’s just a few malcontents and losers, and that anybody who opposes the bill are just dirty pirates, rather than people with legitimate concerns about our freedoms.

Help Stop SOPA/PIPA

January 11th, 2012

Today we’re going to take a break from my usual mind-numbing war stories and take a political stance. I prefer to leave politics out of this blog, but these acts promise to destroy my blogs, many other blogs, and uncountable other web sites. Its purported purpose: to help protect artists’ work and incomes. The actual purpose: to enrich a few legacy middle-men.

It probably doesn’t matter anyway; nobody is reading this blog except spammers’ ‘bots, and maybe a friend or two.

I could pontificate for hours on this subject, if I had the energy. I’ll try to keep it succinct.

  • The proponents claim this is aimed at massively infringing foreign sites, but the language in the acts is very vague. All attempts to change the language to narrowly target massively infringing foreign sites have been met with overt hostility and stonewalling. This, if nothing else, tells me that the language is vague deliberately in order to specifically allow abuse after it’s passed.
  • The DMCA takedown provisions and other laws are being abused now. If you think SOPA/PIPA won’t be, I’m sorry, but you’re living in a dream world.
  • These laws are more about killing off competition than preventing copyright infringement, despite what claims are being made.
  • This will not promote innovation, which is why Copyright and Patent laws were created in the first place. This will stifle it. What we need in this economy is more innovation, not less. We also don’t need companies shut down because their only other choice is to go broke defending themselves in court, or to spend all their profits fighting court cases instead of, say, building products and selling them. And using the profits to grow, and hiring people.
  • SOPA/PIPA won’t stop infringement. It may encourage it, as people realize that almost anything they do will infringe, and they lose all respect for IP laws.
  • These laws are being drafted and promoted by politicians who not only are completely ignorant of what the Internet really is and how it works, but they are proud of their ignorance. What’s more, they are ignoring all the voices raised against these laws and claiming that nobody is protesting. Well, nobody important.
  • Many, many clever and intelligent people are strongly opposed to these laws, including some who would personally benefit. Including a few surprises. The fact that so many people with so many diverse points of view are protesting these laws, while only a few support it, should say something in itself.

There’s more, but it’s been discussed many times in many places better than I can express it in a brief blog post. I just wanted to speak up, and point you to a resource or two. First I’m going to post a copy of Jane Wells’ excellent post on this subject. Jane is the UX lead for Wordpress. I’m probably violating copyright by doing this but I doubt Jane will mind; this will help get the word out, and I’m giving proper attribution and pointing back to the original. (And Jane, if you see this and object, please just let me know. I’ll remove the copy and just leave a link.)

I was in two minds about making this post until I saw Jane’s. But since I use Wordpress for free, I figured that somehow I owe it to them to lend active support to their position. (That plus the fact that I agree with them.) So without further ado:

You are an agent of change. Has anyone ever told you that? Well, I just did, and I meant it.

Normally we stay away from from politics here at the official WordPress project — having users from all over the globe that span the political spectrum is evidence that we are doing our job and democratizing publishing, and we don’t want to alienate any of our users no matter how much some of us may disagree with some of them personally. Today, I’m breaking our no-politics rule, because there’s something going on in U.S. politics right now that we need to make sure you know about and understand, because it affects us all.

Using WordPress to blog, to publish, to communicate things online that once upon a time would have been relegated to an unread private journal (or simply remained unspoken, uncreated, unshared) makes you a part of one of the biggest changes in modern history: the democratization of publishing and the independent web. Every time you click Publish, you are a part of that change, whether you are posting canny political insight or a cat that makes you LOL. How would you feel if the web stopped being so free and independent? I’m concerned freaked right the heck out about the bills that threaten to do this, and as a participant in one of the biggest changes in modern history, you should be, too.

You may have heard people talking/blogging/twittering about SOPA — the Stop Online Piracy Act. The recent SOPA-related boycott of GoDaddy was all over the news, with many people expressing their outrage over the possibilities of SOPA, but when I ask people about SOPA and its sister bill in the Senate, PIPA (Protect IP Act), many don’t really know what the bills propose, or what we stand to lose. If you are not freaked out by SOPA/PIPA, please: for the next four minutes, instead of checking Facebook statuses, seeing who mentioned you on Twitter, or watching the latest episode of Sherlock*, watch this video (by Fight for the Future).

Original post here, and please direct your replies there, as she has a lot more visibility than I. Or you can follow Mike Masnick’s excellent Techdirt blog, or read more at Torrentfreak. Or any of dozens of other sites.

Just to make a point: this isn’t about allowing infringement. This is about excessive, over-broad laws intended only to prop up an obsolete business model at the expense of our freedoms and our well-being. There are other ways to deal with copyright violations besides these.

Generating a 921,600 baud clock on a Cypress PSoC 1

June 5th, 2011

I had some other programming philosophy thing I wanted to post here after the last one, but I can’t remember what it was. So I figured I’d share some of the neat tricks I’ve learned over the years. Here’s one.

Lantronix XPort

A few years ago I was working with some Lantronix XPort devices. These are nifty little modules that let you connect a serial port to Ethernet. They cost about US$50 each, and for simple designs can save tons of development time. (You can also use their developer kit to make custom designs.)

We were using them as shipped, only changing the settings. The thing is, they only use standard baud rates, from 300 bps to 921600 bps.

Standard baud rates are a pain in the neck to derive from normal CPU clocks, which tend to use nice, even numbers in the millions with lots of zeros. The Cypress PSoC 1, for example, has a 24,000,000 Hz system clock. Now, you could add an external 7.3728 MHz oscillator to your design, but that just drives the cost up more. We didn’t want to do that.

I mentioned 7.3728 MHz specifically because the Cypress UARTs require an input clock of 8 times the bit rate. And therein lies the problem: your fastest clock on the Cypress is 48 MHz (SysClk*2), but you need to divide by 13 to get (close to) your nominal bit rate, and 48÷13 gives you ≈3.692 MHz. Right around half what you need.

Well, my colleague came up with a clever way around this. The only drawback is that it uses some resources you may not be able to spare. Fortunately we were using CY8C29466 parts, which have lots of digital blocks.

  • You need two digital blocks for this baud rate generator.
  • You need two adjacent output rows.
  • You need an input row corresponding to the lesser of the two output rows.
  • You need a Global Output column and its corresponding Global Input column, and they must correspond to the lesser of the two output rows.

Plus of course you need a digital block for the transmitter and another for the receiver. If you’re doing full duplex on a 4-block part such as something in the 24xxx series, you’ve just run out of digital blocks.

Here’s what you do:

  1. Place two 8-bit PWM blocks.
  2. Set them both thus: Clock=SysClk*2, Enable=High, CompareOut=(two adjacent output rows), TerminalCountOut=None, Period=12, PulseWidth=6, CompareType=Less Than, InterruptType doesn’t matter (I set to Terminal Count by habit), ClockSync=Unsynchronized, InvertEnable=Normal. Note that I actually set the second PWM’s PulseWidth=5 but it’s not critical. It’s going to be asymmetrical either way.
  3. Click on the Digital Interconnect for the lesser of the output rows you chose (the one that’s closer to the top of the page). Now set the LogicTable Select to A_XOR_B and set the output to whatever Global Output column you chose.
  4. Click the Global Output, set the Interconnect to OutputToInput.
  5. Click on the Digital Interconnect for the corresponding input row. Select the input column you just interconnected. Set the synchronization to Async.
  6. Place your serial parts. Set the clock input to the input row you just set up. Set ClockSync to Sync to SysClk*2.

You should end up with something like this. Click on the image below to see the full size.

Screenshot of PSoC Designer page for 921,600 bps generator

Now comes the tricky part. When you enable the PWMs in your code you need to make sure their output timing overlaps. I don’t know what it would take in C, but in assembly it’s a matter of adding NOPs between the two macro calls until the output does what you want.

_main:
    PWM8_1_Start_M
    nop                    ; NOPs take 4 cycles each. I just stuck in 2 for illustration.
    nop                    ; I don't actually remember how many I added to make it work.
    PWM8_2_Start_M         ; This is an OR REG[expr],expr which takes 9 cycles.

I was never able to make the output work properly by calculating the exact timing; I always had to fiddle to get it right. You might have more luck. Or skill. The important thing is that they be out of phase, and the transitions happen at different times. Also that once it is written into the code the number of CPU cycles between start statements is fixed; once it works, it will work every time.

The easiest way to check is, of course, to temporarily (or permanently if you like) hook the generated signal to an output pin and attach an oscilloscope. If you were to view all three signals you might see something like this. (I did this in Excel, of course; I don’t have access to my original ’scope data. The actual signals would never look so clean.) Click on the image to see the full size.

Overlapping 460.8k inputs and a 921.6k output

Overlapping 460.8k inputs and the 921.6k output you get by XORing them

The actual amount of overlap is not critical. Obviously it will never be symmetrical, but it goes through a ÷8 process in the UART anyway. The important thing is that the output is twice the frequency of the inputs.

It’s pretty clever, and I’m not even bragging — it was my colleague’s idea. I just made it work.

It’s Not Always Best to Optimize to the Worst Case

March 5th, 2011

The “worst case” in this post usually refers to time, since time is a somewhat flexible resource. I’m not saying that you won’t have some severe constraints – if you didn’t, you wouldn’t need to optimize for it, and my examples would be pointless – but it’s a more flexible resource to work with than, say, the size of your ROM. If you have 4k of ROM space, that’s it – you can’t shuffle things around to make space, you can’t delay one section of code to give another more space. If you run out you either need to find a way to optimize the code better or eliminate some features. And in fact, you might have to sacrifice some time somewhere – in a section of code where time isn’t as important – to make the code more space-efficient.

Mind you, time is a lot trickier to work with. For one thing, you can’t just pick it up and look at it, or read a number from an output file. What’s even worse is that in the computer world you’re usually working in microseconds. Or nanoseconds, though I maintain that by the time you’ve gotten into hundredths of a second you’ve passed the threshold of human perception anyway – milliseconds, microseconds, nanoseconds or femtoseconds, you need tools to examine them in any case.

“Worst case” refers to a section of code that periodically or occasionally performs a certain service, but the amount of time it takes varies depending on circumstances. Time is virtually always a constraint when you’re working in embedded systems; they often require real-time performance, which means that if certain functions don’t happen exactly when they’re needed, things break. Even non-real-time systems can only tolerate a certain amount of lassitude. These days I’m mostly working with switches and lights; a tenth of a second round trip is my usual limit, and even that has some elbow room. (And a tenth of a second is at least 3 in geek years!) But if a process takes 10 seconds to complete and my user needs to roll the airplane – if he moves the yoke and nothing happens for 10 seconds because of that long process he’s going to want to have a word with me. In a dark alley. With a couple of rather beefy men named Bruno and Luigi.

Chances are that 10 second process can be interrupted, chopped up, or eliminated. But there are tasks that, while shorter, cannot be interrupted. …Such as the interrupts! Interrupts happen because they are by their very nature time-constrained – you interrupt the normal flow of the process, which can wait for a bit, to do something that can’t wait. However, you want them over with as quickly as possible. Especially periodic interrupts! One absolute must is that a periodic interrupt must service quickly enough that a reasonable amount of processing can be done before the next interrupt. (Actually true of all interrupts, of course, but periodic ones are known to happen at a certain fixed rate because they’re, well, periodic. You won’t get a burst of activity now and a lull later.)

So you get in and out of your periodic interrupt as quickly as possible. How? Several methods can be employed (including setting flags or counters and doing as much of your processing as tolerable in the background loop). But usually you just optimize the heck out of it. Find the best, fastest way to accomplish the task. And if the task time is variable you try to choose your decision branches so they’ll average out to the shortest time possible. If for example it takes longer to take a conditional branch than not to, and the decision is asymmetric (it’s likely to take one branch more often than the other), you try to arrange your code such that it drops through on the more frequent decision and only takes the longer path when it has to. Most systems (not all) can tolerate the occasional long interrupt better than a steady stream of slightly shorter ones as long as the average is short enough.

Fake-Time Clock

Now on to our first scenario: a real-time clock on a slow processor. (It’s also me bragging a bit. Cope.)

The year is 1981. The processor is an 8085 running at something like 4 MHz. Back in those days programmers were real programmers ;) and resources were much thinner than they usually are today. A modern applications programmer would be horrified to be limited to 32k of program space and 4k of RAM, but to me that’s luxury. This 1980s system might have sported an entire 2K bytes of ROM, if we were lucky.

There was a 60 cycle interrupt (generated from the incoming AC power of course). There were a number of functions that had to be performed in that interrupt and one of them was a real-time clock. The code went something like this:

  • Add 1 to the 60th-of-a-second counter.
  • Did we reach 60 (60/60ths, or one second)? If so, set to 0 and add one to the 1-second counter.
  • Did we reach 10 seconds? If so, set to 0 and add one to the 10-second counter.
  • Did we reach 6 (60 seconds)? If so, set to 0 and add one to the 1-minute counter.

…and so on until we counted 24 hours, at which time we wrapped back to zero. Obviously this is a BCD value that allowed us to pluck the whole time from the array and easily convert to human-readable. We could of course have post-processed it instead but dividing by 10 without APU support is painful. You’ve seen my divide functions, right? Look at my last few posts. Those take hundreds of cycles to complete, and we’d need at least three of them (more if we kept time as a single “seconds since midnight” value) on a terribly slow processor… this method makes sense.

Our real-time clock ran almost exactly half speed. It took two seconds for it to count one second.

OK, class, what’s the worst case path here? Answer: rolling over at midnight. That requires clearing every single value to zero.

Mark (not his real name) decided to optimize for the worst case. I don’t remember all the details but it amounts to: he didn’t actually test for the carry, he simply arranged each branch in the decision tree to perform the calculations as though the previous one had gotten a carry. No, he wasn’t incrementing when he shouldn’t – he was simply adding carries, which means most of the time he was adding zeros all the way up to the top. So his code was keeping time correctly, and within the constraints of the environment was very efficient at handling the worst case.

But it was taking so long that it ran into the next interrupt! The system could tolerate one of those in a row, occasionally – in fact it could probably tolerate one of those on every other interrupt. But when it happened on every interrupt it meant that we’d get the next interrupt partway through the current one, so it wouldn’t be processed until we finished the current one. Eventually we’d miss an interrupt because the flag was still set from the one we hadn’t processed when we got another. It just worked out that we were falling behind enough to miss every second interrupt.

Fortunately we had an In-Circuit Emulator (and in those days they really were In-Circuit Emulators) so I could see where the timing error was. A simple code review showed code that was not how I would have written it, but perfectly valid: there was no actual bug. The timing, not the logic, was the issue. (Not that I’m slamming code reviews. Please don’t think for a second that I’m discouraging code reviews! A good developer uses all the tools at his disposal.)

Here’s the thing:

  • I would have written code that dropped out as soon as we didn’t get a carry. (I think most people would have. Not saying Mark was bad for going a different way, just that I’m not trying to claim I’m unique. Or superior.)
  • Yes, roll-overs will take even longer in my code, because rolling over takes the longer path and I’ve added overhead to that path, but…
  • …the system could tolerate an occasional long interrupt, and I could guarantee that you would never have a rollover more often than once every 60 interrupts.

So once in a while, no more than once per second (actually probably less often than once per hour), the interrupt would stretch into the next interrupt — one time — and then it would recover during the following interrupt. All I did then was rearrange the code to work the way I would have written it in the first place and the problem went away. Poof.

Multiplexing 7-Segment Displays

This one is more recent, less than two years ago. I was writing new code to handle old and new Avionics panels for our simulator. Panels vary as to the number of digits; the largest has 10 actual digits, plus several LED indicators that are wired into two more digit positions.

I can drive one digit at a time. Well, I can drive as many as I want, but they sort of get all mixed in together, so I have to drive one digit at a time. They take turns. Very quickly, because we want them to look solid to the human eye. Multiplexed.

(I don’t know why I keep defining these terms. Chances are if you’re reading this and you don’t already know what they mean you’re not an electronics or computer geek, and if you’re not then you’re probably a masochist, because this isn’t exactly pulse-pounding, riveting action-adventure stuff. Habit, I guess, not assuming my audience knows everything I do.)

LEDs aren’t very predictable at being dimmed by controlling their current, unlike incandescent bulbs, but they’re very fast, so you dim them by turning them on-and-off very quickly. The longer they’re turned on, on average, the brighter they are. Multiplexing does exactly that – so the more digits you have to multiplex, the dimmer each LED is, because each LED gets a smaller share of the time.

The circuit these LED digits are on is a simple 5v driver for each segment, and a constant-current, open-collector (or open-drain or whatever) driver to sink each digit. Since the sinking driver is constant-current and the LEDs are driven in parallel it means that the segments are sharing the current between them. The more segments are on, the dimmer they all are.

So we have a variable brightness between the digits on a display (depending on how many segments are lit) and a variable brightness between the displays (depending on how many digits must be multiplexed) and a desire for all the lights on the panel to be approximately the same brightness because it looks better.

So first I decide how quickly I have to refresh the displays to prevent flicker. That’s the amount of time I have. Then I have to decide how frequently I can tolerate interrupting. That gives me a total of number of interrupts or “ticks” I can apply to the display. The number of digits is 12 – so I can never get the ADF brighter than 1/12th of the LEDs’ full brightness, and so all the other displays must be dimmed to the same value. So I can spend an average of n ticks per digit, which is the number of ticks I have per cycle divided by 12.

Worst case is the ADF, with 12 digits, displaying 8.8.8.8.8. 8.8.8.8.8. (that’s 10 real digits plus several individual LEDs wired in place of two extra digits) because the decimal point is a segment, too.

Next I make a list: how many ticks to keep a digit turned on per cycle, based on how many segments are lit, which depends on the value of the digit (“1” lights two segments, “2” lights five segments, etc.) plus 1 if the decimal point is lit. A digit that has few segments lit is brighter, so I spend less time lighting that digit before moving on, so it compensates until the digits are the same brightness no matter how many segments are lit. (I experimented and compared and adjusted until the differences were imperceptible unless you were staring right at it LOOKING for differences.)

Then I have a cycle time total that is fixed, and I add a 13th “virtual” digit whose time is whatever is left over after the other digits are lit. That way if they’re all very short it spends a lot of time idling, otherwise the digits get brighter.

Starting with the worst case, I have n ticks per digit average, and if all segments of all digits are lit then I can’t spend more than n ticks per digit – my average becomes my maximum. But that’s pretty dim. And I end up with a dilemma, because I don’t want the digits too dim, but I want them to all be the same brightness on all displays!

And then Eric (his real name), when confronted with this conundrum, asked a pertinent question: how often do you really have all the segments lit?

Well, actually, the answer is: during testing. There’s pre-sale bench testing, and there’s a test button on one of the panels, but during normal operation we will never have all the segments lit. Period. We Just Won’t.

In fact, in most of the displays the values are constrained; so for example, the leftmost digit in most scenarios will never be anything except “1”. Another savings is that there will be only one decimal point lit in any group of 5 digits. On some displays the last digit will always be 0, 2, or 5.

And so on. I started with the worst case and a list of likely values and pared the theoretical away from the practical and came up with numbers I could live with. I added detection for the “number of remaining ticks this cycle” to never reach zero, which means if we ever had too many segments lit it would just stretch the cycle a bit and the digits would still display correctly, if slightly dimmer. It means some digits can spend more time than average because others will spend less, and my brightness goes up during the times it matters and only down during the times it doesn’t.

And you know what? Not only does it work, but it turns out that even when we’re testing you don’t notice the digits dimming. It’s just not enough to matter, even when I’m being my most picky.

Of course I’m still optimizing for a worst case, but only the worst normal case, not the worst possible case. I just needed somebody to point out my rectal-cranial inversion so I could stop spinning my wheels.

A further note: my boss (not his real name) is one of the best people I know for finding usability flaws, and he’d spot an inconsistent display sooner or later, and probably sooner, and he’d definitely say something. After more than a year he hasn’t said a single word. If he doesn’t see a problem then there isn’t a problem.

blogG0d

Hiatus

October 30th, 2010

“Hiatus” means “gap” or “interruption” or any of a half dozen other similar meanings.

In this case it means “I haven’t written in a while because Life Happens.” And in this case “Life Happens” means that my wife left me and I’ve moved into a duplex by myself and, well, let’s just say my life’s had some upheaval.

Not by myself for long. As I write this my wife is driving here with my daughter, who will be moving in with me. So more disruption. A more positive type this time but when you live by yourself you can get away with minimal furniture and bare necessities, but my daughter needs her own space and a place to sleep. Other than the floor. :)

The good news is they’re now about 50 miles away and approaching rapidly, including my younger daughter, and I’m pretty excited. Not excited enough to write a real blog post right now, there are still things I want to finish up before they get here. :)

But I wanted to let my readers, should I have any, know what is going on. And there you have it.

P.S. I did end up with one of the cats. It was inevitable. She’ll be very happy to see my daughter, I think, though it’s hard to tell with cats. I said “D___ is moving in, she’ll be here around noon today” and the cat said “Why aren’t you petting me? Now open the door, I want to go outside.” I think she’s just being coy.

PIC-16 Bitwise Long Divide

May 15th, 2010

I promised I’d post my long divide function for the PIC. There was a resounding silence — no surprise considering most if not all the comments to date have been spam.

I just hope the next poor slob who needs an assembly language divide function for the PIC can find this page with a web search.

So… the function I wrote for the PSoC part is pretty universal. You should be able to port it to your favorite 8-bit part with minimum fuss. However, the PIC-16 and PIC-14 micros have a really funky hole in their operation set: they don’t do add with carry (or subtract with borrow). I have no objection to RISC architecture, though I’m not entirely convinced that it’s inherently better than CISC — I simply have no real opinion on the subject — but come on, guys. You’ve gotten carried away with this one.  Add with carry? Just how crazy are you trying to be? It’s not a CISC operation and it would have made multi-byte arithmetic considerably simpler. This is probably my single biggest complaint about the PIC micros.

So I’ve created a separate PIC port for this routine, and I’m posting it here separately, because it’s a pain in the neck to port the PSoC version.

The other thing I’ve done here is taken advantage of Microchip’s much more extensive (read: “not brain dead”) preprocessor and created a version of the code where you can choose (at compile-time) how many bytes the dividend and divisor occupy. This makes it simple to port to however many bits of resolution you need, as long as it can be expressed in multiples of 8.

Enjoy!

;
; Copyright 2009 Jeffrey J. Nonken
; License: permission granted for unlimited use and unlimited derived use.
; If you use this, I'd appreciate it if you'd give me credit.
; Just leave these comments intact or mention me in your comments. I don't need
; money for this, but I'd appreciate credit where it's due.
; (Though if you want to send me money, I won't refuse it. :)
;
; Like what I did here? Feel free to offer me a job. http://jeffrey.nonken.net/
;
; Let me know if you have any improvements. PLEASE let me know if you find bugs.

;
; This is an unsigned integer divide function, x bits divided by y bits.
; Stuff your values into "dividend" and "divisor" and call the function (div24by16 in this sample).
; Note that "dividend" is stored as an x+1-byte value, the high byte is used internally for overflow.
; Any value you place here will get stomped on. Nothing is preserved. If you need the dividend or
; divisor later, make sure they're stored someplace else too.
; The remainder is computed as a side-effect of the divide but we do not attempt to preserve it.
; A few lines of extra code could be written to return the remainder in "divisor", but it will take
; some shifting around to justify the value properly. Use the inverse of "mask" on the result in
; "dividend" and the value calculated in "shift_count" to figure that one out. Don't forget to
; include the overflow bit.
;
; I originally created this algorithm in the 1980s to work on an 8086 processor.
; Later I ported it to the Cypress M8 processor for 24 over 16. (That code is in the Times Square Ball, by the way.)
; In order to avoid some tricky stuff on the PIC 16 series processor I ported it to "C", ran a test on a simulator,
; debugged it, then got the PICC C compiler to generate assembly code for 16-over-16 bit. I hand-optimized the result.
; After porting it back to 24-over-16 bits I decided I could use constants and macros to create something that
; could handle any number of bytes for the dividend and divisor -- between compiles, I mean, not run-time.
; (That could happen too but it would be much less efficient. I already did a couple compromises just for this.)
; This is what's left.
;
; So set your dividend and divisor size (in bytes) at the start of this, change the function name to something
; appropriate, and import it into your project. Just make sure the dividend has as many bytes as the divisor,
; or more, when you pick DIVIDEND_SIZE and DIVISOR_SIZE. No, that doesn't refer to the values you pass to
; the function -- if dividend is smaller than divisor at run-time you'll just get zero as a quotient.
;
; I wasn't using the PICC C compiler to write the code for the project because it's not well-enough integrated
; into MPLAB to do simulation or emulation. I used Microchip's C18 compiler set to an 18F4455 MCU for the
; test/debug cycle, then used PICC C to generate the assembly.
;
; It's not pretty, but it works.
;

        INCLUDE defaults.inc

; Set your word sizes here.
DIVIDEND_SIZE   equ     3               ; Dividend size in bytes. Also the quotient size.
DIVISOR_SIZE    equ     2               ; Divisor size in bytes.
; End of configurable equates.

 if (DIVIDEND_SIZE < DIVISOR_SIZE)      ; One caveat: dividend can't be smaller than divisor.
        error   "DIVIDEND_SIZE must be greater than or equal to DIVISOR_SIZE!"
 endif
MASK_SIZE       equ     DIVIDEND_SIZE   ; Number of bytes in the mask the same as dividend.
DIVISOR_MSB     equ     DIVISOR_SIZE-1  ; MSB offset for the divisor.
DIVIDEND_MSB    equ     DIVIDEND_SIZE-1 ; MSB offset. Doesn't include the overflow byte.

        UDATA

        GLOBAL dividend,divisor,quotient
quotient:                               ; Outgoing quotient is in dividend. Doesn't have to be, but if not,
                                        ;  you'll have to pre-clear the quotient and shift it separately.
                                        ;  I don't think you'll save anything doing that.
dividend:       RES     DIVIDEND_SIZE+1 ; Incoming dividend, outgoing quotient.
                                        ;  We reserve an extra byte for overflow (1 bit).

divisor:        RES     DIVISOR_SIZE    ; Incoming divisor. This gets destroyed too. Just shifted, actually.
mask:           RES     DIVIDEND_SIZE   ; Mask used to separate quotient from remainder.
shift_count:    RES     1               ; Number of shifts required to complete the operation.
                                        ;  Also number of bits in the quotient mask.
mask_count:                             ; Counter used to generate the mask.
                                        ;  (Re-use carry, we don't use both at one time.)
carry:          RES     1               ; Holding register for the carry bit for multi-byte arithmetic.

        CODE

;
; This function tries to be efficient by shifting both values to the far left to minimize the number of repititions
; through the main loop. That comes at a cost, of course, but I expect that most of the value pairs I use will do
; better this way.
;
; Here's the original comment block, modified. Note that it was written for 24x16 divide.
;----------------------------------------------------------------------------------
; Divide a 24-bit number by a 16-bit number. Leaves the quotient in the dividend
; (numerator) and throws away the remainder.
;
; This works like a classic long division; it shifts the numerator and denominator
; to the top of their respective words (so they are the same apparent magnitude), counts
; the number of shifts and uses it to calculate the number of subtracts needed as
; well as generate the mask that eliminates the remainder.
;
; This can easily be extended or reduced to accomodate various word sizes. There
; should be no problem dividing a smaller word by a larger, but you will have
; to virtually extend the numerator with zeros. Since the initial shift factor will
; start out negative you will need to fiddle with that; you do not want to quit
; if you generate a carry, but instead test the sign of the result. (You still will
; want to jump to .zero_quotient if the shift factor is negative. Note that a shift
; factor of zero is valid.) [This version tests for zero, not carry, so that's
; no longer true. Lowest valid shift factor is 1. Don't try to use a smaller dividend
; that divisor unless you want to modify the code. I doubt it's worth the trouble.]
;
; If you are expecting small values in the numerator or denominator, you can
; check the high byte for 0 and do 8-bit shifts before doing the final 1-bit shifts.
; Since the 8-bit-shift loop starts with a cmp (or tst) and a jump, the cost is
; only about 13 cycles if it is not necessary. [Cypress cycles.]
;
; If you need to keep the remainder it gets a bit more complicated. The calculated
; count, and the final calculated mask, will tell you where the remainder is in
; the numerator: the remainder's LSb is just above the highest 1-bit in the mask.
; You may need to copy it out, mask and shift it before saving the quotient. You
; can also contrive to save the quotient into a different word rather than into
; the numerator, in which case you only need to shift the remainder back around.
; However, this will cost you more shifts during the calculation (you need to
; shift the numerator and the quotient separately).
;
; You can save yourself several bytes of RAM storage by calculating the final
; mask on-the-fly.
;----------------------------------------------------------------------------------
; Please note several things.
; 1) This is little-endian. Stuff/retrieve the LSB of the value in the lowest
;    address of the dividend/divisor/quotient.
; 2) This is entirely, completely, 100% unsigned. Want signed? Write your own,
;    or preserve the signs and convert to absolute values before the divide,
;    calculate the sign of the quotient and complement if required. If you need a
;    reminder: the quotient's sign will be the exclusive-or of the two inputs'
;    signs. The remainder, should you choose to keep it, is the same sign as the
;    dividend.
; 3) Make sure you set (or clear) ALL the input bytes (dividend and divisor)
;    before calling. That should be obvious, but it's worth the reminder.

        GLOBAL div24by16
div24by16:
; Check for boundary conditions and set up the divide.
        clrf    dividend+DIVIDEND_SIZE  ; Clear the extra byte.
; Look for divide by zero.
        movf    divisor,w               ; See if the divisor is zero. Start with the LSB.
i = 1
        while i < DIVISOR_SIZE
        iorwf   divisor+i,w             ; OR in the rest of the bytes.
i = i + 1
        endw
        skipz                           ; Any 'on'-bits will show here.
        goto    divisor_not_zero        ; Situation normal, keep going.

; Oh noes! Divide by zero. Can't have that.
        movlw   0FFh                    ; Divide by zero? Get as close to infinity as we can.
i = 0
        while i < DIVIDEND_SIZE
        movwf   dividend+i
i = i + 1
        endw
        return;

; Passed the first hurdle: we have a non-zero divisor.
divisor_not_zero:
;divide.c: 62: }
;divide.c: 63: shift_count = 1;
                                        ; Shift-count is the difference in the number of bits, plus 1.
        movlw   1 + ((DIVIDEND_SIZE-DIVISOR_SIZE) << 3)
        movwf   shift_count
;divide.c: 64: while ((divisor & 0x8000) == 0)
        goto    divisor_shift_start

; Start with zero. Is the dividend larger? Add 8 bits for every byte larger.
; So in a 24x16 divide we start with 8.
; Add 1 for every time we shift the divisor left.
; Subtract 1 for every time we shift the dividend left.
; Plus one so we can use decfsz for the loop test. (Otherwise we have to test for borrow.)
divisor_shift_loop:
;divide.c: 65: {
;divide.c: 66: ++shift_count;
        incf    shift_count,f           ; Count up every time we shift the divisor left.
;divide.c: 67: divisor <<= 1;
        clrc
i = 0
        while i < DIVISOR_SIZE
        rlf     divisor+i,f             ; Shift all bytes left, starting with the LSB.
i = i + 1
        endw
divisor_shift_start:
        btfss   divisor+DIVISOR_MSB,7   ; Stop shifting once the high bit is non-zero.
        goto    divisor_shift_loop

;divide.c: 68: }
;divide.c: 71: while ((dividend & 0x8000) == 0)
        goto    dividend_shift_start

; OK, now shift the dividend, but this time count backwards on the shift count.
dividend_shift_loop:
;divide.c: 72: {
;divide.c: 73: if (--shift_count == 0)
        decfsz  shift_count,f                   ; If it reaches zero that means the divisor is larger!
        goto    shift_not_zero                  ; If that happens, our quotient is guaranteed to be zero.
; Oops. The shift value reached zero, which means the dividend's value is smaller than the divisor's,
; which means the quotient is zero. If you want to return the remainder you'll have to stick in some
; extra code here.
        ;divide.c: 74: {
;divide.c: 75: dividend = 0;
i = 0
        while i < DIVIDEND_SIZE
        clrf    dividend+i                      ; Quotient is zero. Just clear it out the easy way.
i = i + 1
        endw
        return                                  ; Render the output and return.
; Still some meat on this, so keep going.
shift_not_zero:
;divide.c: 77: }
;divide.c: 78: dividend <<= 1;
        clrc                                    ; Clear the carry so as not to contaminate the value.
i = 0
        while i < DIVIDEND_SIZE
        rlf     dividend+i,f                    ; Shift all dividend bytes left starting with the LSB.
i = i + 1
        endw
                                                ; We'll never shift anything into dividend+DIVIDEND_SIZE in this loop.
dividend_shift_start:
        btfss   dividend+DIVIDEND_MSB,7         ; See if the high bit is set.
        goto    dividend_shift_loop             ; If not, repeat until it is.
;divide.c: 79: }
;divide.c: 80: {
;divide.c: 81: unsigned char mask_count = shift_count;

; Now that we have shift_count, we know how big the quotient is. Use that to generate a mask.
; If you want to save RAM you can generate the mask on-the-fly just before returning.
; Just make sure you copy off mask_count first, because we destroy shift_count.
        movf    shift_count,w
        movwf   mask_count                      ; shift_count is guaranteed to be at least 1.
;divide.c: 82: mask = 0;
i = 0
        while i < MASK_SIZE
        clrf    mask+i                          ; Start by clearing out the mask.
i = i + 1
        endw
;divide.c: 88: while (mask_count)
mask_shift_loop:
;divide.c: 89: {
;divide.c: 90: mask = (mask << 1) | 1;

; (Notice we don't clear the carry first. Don't need to.)
i = 0
        while i < MASK_SIZE
        rlf     mask+i,f                        ; For every count, shift left by 1...
i = i + 1
        endw
        bsf     (mask),0                        ;  ... and set the low bit.
;divide.c: 91: --mask_count;
        decfsz  mask_count,f                    ; Count down and repeat ad nauseum.
        goto    mask_shift_loop
;divide.c: 92: }
;divide.c: 93: }
;divide.c: 94: do

; Hey, guess what? After all that we finally get to do some division! Hurrah!
looping:
;divide.c: 95: {
;divide.c: 96: if (divisor <= dividend)
; Remember your grade-school long division. Line up the values on the left. Can you subtract?
; If not, shift the divisor right 1 digit and try again. Of course, that's in base 10, so you need to
; find a multiplier to put up top. Easier for us here, it's just zero or 1. Once you do that
; multiply the divisor by the multiplier and subtract the result from the dividend.
; Move to the next column and repeat until you run out of dividend.
; Whatever is left is the remainder.
; So: Find out if we can subtract. If we can, do it and stick a 1 in the quotient, then shift the dividend left.
; If not, just shift the dividend left. Note that if we do that, we might end up with an extra bit off the high end.
; That's why we reserved an extra byte for the dividend.
        movf    (dividend+DIVIDEND_SIZE),w      ; First, see if the overflow bit is set.
        skipz
        goto    dividend_is_bigger              ; If dividend overflow byte is non-zero, dividend is bigger by definition.
                                                              ; Otherwise drop through to compare the two values.
; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; Can't really make a local label in a while/endw (assembler limitation) so we're making a macro to call.
        sub_byte        macro
        local next
        movf    (divisor+j),w
        subwf   (dividend+i),w
        skipnz
        goto    next                            ; If they're the same, drop through.
                                                ;  Either try the next one or if this is the last, allow the subtraction.
        skipnb                                  ; They're not the same, see which one is bigger.
        goto    divisor_is_bigger
 if j > 0                                       ; If this is the last byte, just drop through.
        goto    dividend_is_bigger
 endif
next:
        endm
; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

; Start from the highest byte and compare. It's just a gt/lt/eq comparison so we don't need borrows and so on.
; As soon as we see one is larger than the other we can stop. If they're equal it's OK to do the subtract,
; so we treat that the same as the dividend being larger.
i = DIVIDEND_MSB
j = DIVISOR_MSB                                 ; Start with the most significant byte of each value.
        while j >= 0                            ; Keep trying until we run out of divisor.
        sub_byte                                ; Run the macro (above). Any difference jumps out of the while/endw "loop".
j = j - 1
i = i - 1                                       ; Still the same, try the next byte.
        endw
; Both values being the same is OK too. Drop through to dividend_is_bigger.

; If we get here, either the dividend is bigger or they're exactly the same. Either way we can do the subtraction.
dividend_is_bigger:
;divide.c: 97: {
;divide.c: 98: dividend = ((dividend - divisor) << 1) | 1;
j = 0                                           ; Start the divisor index at zero.
i = DIVIDEND_MSB - DIVISOR_MSB                  ; Count backwards from the dividend MSB so we do the same number of bytes.
        while j < DIVISOR_SIZE ; First, take care of the borrow from the previous byte.
 if j > 0                                       ; No borrow waiting on the first byte!
        movf    carry,w                         ; Get the borrow from the previous byte.
        subwf   dividend+i,f                    ; Subtract it from the current byte.
        clrf    carry                           ; Clear the borrow...
        skipnb
        incf    carry,f                         ;  ...and set it again if we have a new one.
 else
        clrf    carry                           ; First byte, just clear the borrow.
 endif
 ; Now do the subtract.
        movf    divisor+j,w                     ; Get the divisor,
        subwf   dividend+i,f                    ; subtract from the dividend.
        skipnb
        incf    carry,f                         ; Add to the borrow.
                                                ; The borrow, "carry", will be 0 or 1 here, carried forward to the next byte.
j = j + 1
i = i + 1                                       ; Move on to the next byte.
        endw
        movf    carry,w
        subwf   (dividend+DIVIDEND_SIZE),f      ; Extend the borrow to the overflow byte.
i = 0
        while i <= DIVIDEND_SIZE
        rlf     dividend+i,f                    ; Now shift the dividend left for the next subtract.
i = i + 1
        endw
        bsf     (dividend+0),0                  ; This subtract succeeded, so record a "1" bit in the quotient.
;divide.c: 99: }
        goto    loop_continue

; If we get here the subtract failed; just shift the dividend left and record a "0" bit in the quotient.
; This is where the dividend overflow byte comes into play. We wouldn't need it otherwise.
divisor_is_bigger:
;divide.c: 100: else
;divide.c: 101: {
;divide.c: 102: dividend = (dividend << 1);
        clrc                                    ; Clear the carry ahead of time (that "0" we mentioned above).
i = 0
        while i <= DIVIDEND_SIZE                ; (Note this says "less-than-or-equal" to accomodate the overflow.)
        rlf     dividend+i,f                    ; Shift all the bytes left, including the overflow.
i = i + 1
        endw

; Decide whether we're finished the divide, and loop if not.
loop_continue:
;divide.c: 103: }
;divide.c: 104: } while (--shift_count);
        decfsz  shift_count,f                   ; Count down, skip if finished.
        goto    looping                         ; Not finished, start another iteration.

; Hey, we're finally finished! Strip away the remainder and return.
; The mask can be generated on-the-fly here if desired to save RAM, though it might cost cycles.
; Then again, maybe not.
; If you want to return the remainder, now's the time.
;divide.c: 105: dividend &= mask;
i = 0
        while i < DIVIDEND_SIZE
        movf    mask+i,w
        andwf   dividend+i,f                    ; Strip away the remainder, leaving only the quotient.
i = i + 1
        endw
;       clrf    dividend + DIVIDEND_SIZE        ; Clear the overflow byte. Probably redundant.
;divide.c: 106: }
        return

        end

Too much spam

April 19th, 2010

I’m sorry to have to do this so early in the game, but the spam is already somewhere between “annoying” and “overwhelming”. I’m going to start requiring registration to post comments.

Many thanks to those real humans who have taking the time to read and especially comment on my blog.

PSoC M8 Bitwise Long Divide

March 6th, 2010

One problem with 8-bit micros is that their math processing is usually limited to addition and subtraction. You can do a lot with addition and subtraction, but sometimes you need to do multiplication and division. Even the mighty 8086 didn’t get its FPU built in until the 80386 (and it was optional until the Pentium, if I remember correctly) though you could get an external FPU.

Of the two processors I use most (the Microchip PIC18 and the Cypress M8) each has an 8-bit multiply function accessible via registers (though the M8 is a signed multiply — who was the genius who thought that up? It makes a mess of doing multi-word multiplies). And multiply is comparatively simple to write.

But to do a proper divide requires more work. Especially if you don’t want to just subtract the divisor from the dividend and add 1 to the quotient until you run out of dividend. It works, but it’s terribly slow.

I’m not very strong on math so I fell back on what I knew. What I knew was long division as taught to me in elementary school. You know — subtract a multiple of the divisor from the top of the dividend, record the multiplier, calculate the remainder, then do it again on the remainder. Wash, rinse, repeat until you run out of dividend or patience. (OK, not the best description, but you already know the drill.)

So that’s what I did for my bitwise long divide. I first wrote it for the 8086, later I ported it to the M8, then the PIC16, then the M8 again when I lost the first code. I’ve ported it to several different sizes of dividend and divisor; with some work I could probably make it run-time adaptable but of course that would only slow it down. I did make a macro version for the PIC16 that uses preprocessor directives (while/endw mostly) to let you generate code for any size words you like. I don’t think I can do that for the M8, their preprocessor is very primitive. The original M8 code is in the Times Square Ball firmware, by the way. What I’ll be giving you has been ported twice since then, but it’s essentially the same. (I’ll give you the PIC16 code in a separate blog post, fear not.)

Here’s the general idea. First, shift both values (dividend and divisor) left until each is at the top of its word. Keep track of how many times you shifted each (plus the difference in word size); that tells you how many bits will be in your quotient (and indirectly your remainder), though it doesn’t say how many of those bits are set to 1, don’t get me wrong. But the quotient will never be any larger than that number of bits. That number also tells you when to stop subtracting; it’s a counter.

Next you do a subtract-test-record-shift cycle exactly like the one in your by-hand long division, except that ours is in binary. You are shifting the dividend, not the divisor (which stays constant).

Now here’s the tricky bit. Every time you do a subtract-shift you discard one bit of significance. That means that a new low bit becomes available in the dividend on each cycle. So instead of putting that bit into a separate quotient (and requiring a second set of shifts — remember that we’re in a slow, 8-bit processor), you can set the result of your subtract into the low bit of the dividend after you’ve shifted it. Of course that means you can’t (easily) test for zero to see if you’re through; that’s why you need that counter. Still, leaving the divide early due to the dividend going to zero is an unusual case, I should think. (And if it is not, in your particular case, due to peculiarities of your particular data set, then you obviously want to optimize your code to take advantage of that.)

That leaves the quotient in the dividend. But what about the leftover bits after the last subtract? Well… that’s your remainder.

Now early on we determined how many bits the quotient would have. We use this to generate a mask that lets us preserve the quotient and throw away the remainder. But what if you need the remainder? That’s a bit more work but it turns out to be exactly the number of bits remaining that aren’t quotient. All you have to do is copy, mask, and shift right by — guess what? — the number of bits in the quotient, the same counter we calculated up front. (Actually you can skip the ‘mask’ step for the remainder since you’re going to shift the quotient into the bit bucket anyway.)

Cool stuff, Maynard.

Unfortunately I haven’t figured out how to attach files to this blog (except media). I can of course upload elsewhere and link to it. Meantime I’m going to try to simply paste it into the blog.

Bonus! Here I am giving you two different word sizes, 24/16 and 32/16. The differences should be obvious; it should help if you ever need to port to other word sizes. Please note however that the design doesn’t really allow for the divisor to have more bits than the dividend.

One other note in reference to my post “Optimize This!” regarding re-use of variables. Note that I do that here. First, it’s stable, test code; second, you’ll notice that I gave the variable two different, meaningful labels for the two separate uses; third, you’ll notice that I commented it.

On the other hand, I didn’t really comment the code as much as I usually do. I did however leave in the C code (as comments) I used during one porting process.

One last note: notwithstanding the number of bits we counted, when you store the results the quotient word size is the same as the dividend and the remainder is the same word size as the divisor. Again, that’s the general case and your dataset may generate smaller words, optimize for your dataset.

;
; Copyright 2010 Jeffrey J. Nonken
; License: permission granted for unlimited use and unlimited derived use.
; If you use this, I'd appreciate it if you'd give me credit.
; Just leave these comments intact or mention me in your comments. I don't need
; money for this, but I'd appreciate credit where it's due.
; (Though if you want to send me money, I won't refuse it. :)
;
; Like what I did here? Feel free to offer me a job. http://jeffrey.nonken.net/
;
; Let me know if you have any improvements. PLEASE let me know if you find bugs.

;
; This is an integer divide function, 24 bits divided by 16 bits.
; Stuff your 24-bit value into "dividend", 16-bit value into "divisor" and call div24by16.
; Note that "dividend" is a 4-byte value, the high byte is used internally for overflow.
; Any value you place there will get squashed first thing.
; The quotient of the divide will be left in dividend, so make sure your value is safe someplace else.
; For that matter, I destroy divisor, too.
; The remainder is computed as a side-effect of the divide but we do not attempt to preserve it.
; A few lines of extra code could be written to return the remainder in "divisor", but it will take
; some shifting around to justify the value properly. Use the inverse of "mask" on the result in
; "divisor" and the value calculated in "shift_count" to figure that one out. Don't forget to
; include the overflow bit.
;
; I originally created this algorithm in the 1980s to work on an 8086 processor.
; Later I ported it to the Cypress M8 processor for 24 over 16. [That code is in the Times Square Ball, by the way.]
; In order to avoid some tricky stuff on the PIC 16 series processor I ported it to "C", ran a test on a simulator,
; debugged it, then got the PICC C compiler to generate assembly code. I hand-optimized the result.
; This is what's left.
;
; I wasn't using the PICC C compiler to write the code for the project because it's not well-enough integrated
; into MPLAB to do simulation or emulation. I used Microchip's C18 compiler set to an 18F4455 MCU for the
; test/debug cycle, then used PICC C to generate the assembly.
;
; It's not pretty, but it works.
;

    AREA	bss [RAM, REL]

quotient::					; Outgoing quotient is in dividend.
dividend::		BLK	5		; Incoming dividend, outgoing quotient.
divisor::		BLK	2		; Incoming divisor. This gets destroyed.
mask:			BLK	4		; Mask used to separate quotient from remainder.
shift_count:		BLK	1		; Number of shifts required to complete the operation.
                                                ;  Also number of bits in the quotient mask.
mask_count:					; Counter used to generate the mask.
                                                ;  [Re-use carry, we don't use both at one time.]
carry:			BLK	1		; Holding register for the carry bit. Used in the only
                                                ;  true 3-byte operation we have.

    AREA text [ROM, REL, CON]

;
; This function tries to be efficient by shifting both values to the far left to minimize the number of repititions
; through the main loop. That comes at a cost, of course, but I expect that most of the value pairs I use will do
; better this way.
;
; Dividing 1024 by 33 takes about 345 cycles, including the call/return. Just for the record.
;
; Here's the original comment block. Note that it was written for 24x16 divide.
;----------------------------------------------------------------------------------
; Divide a 24-bit number by a 16-bit number. Leaves the quotient in the dividend
; [numerator] and throws away the remainder.
;
; This works like a classic long division; it shifts the numerator and denominator
; to the top of their respective words [so they are the same magnitude], counts
; the number of shifts and uses it to calculate the number of subtracts needed as
; well as the mask that eliminates the remainder.
;
; This can easily be extended or reduced to accomodate various word sizes. There
; should be no problem dividing a smaller word by a larger, but you will have
; to virtually extend the numerator with zeros. Since the initial shift factor will
; start out negative you will need to fiddle with that; you do not want to quit
; if you generate a carry, but instead test the sign of the result. [You still will
; want to jump to .zero_quotient if the shift factor is negative. Note that a shift
; factor of zero is valid.] [This version tests for zero, not carry, so that's
; no longer true. Lowest valid shift factor is 1.]
;
; If you expect to be using long words or want to macro-ize this [or expect to
; adapt it a lot] you may want to use multi-byte words and index into them
; [e.g. [numerator+1] instead of discrete names for each byte. However, the result
; will be endian-dependent. [This one is little-endian.]
;
; If you are expecting small values in the numerator or denominator, you can
; check the high byte for 0 and do 8-bit shifts before doing the final 1-bit shifts.
; Since the 8-bit-shift loop starts with a cmp [or tst] and a jump, the cost is
; only about 13 cycles if it is not necessary. [Cypress cycles.]
;
; If you need to keep the remainder it gets a bit more complicated. The calculated
; count, and the final calculated mask, will tell you where the remainder is in
; the numerator: the remainder's LSb is just above the highest 1-bit in the mask.
; You may need to copy it out, mask and shift it before saving the quotient. You
; can also contrive to save the quotient into a different word rather than into
; the numerator, in which case you only need to shift the remainder back around.
; However, this will cost you more shifts during the calculation [you need to
; shift the numerator and the quotient separately].
;
; You can save yourself several bytes of RAM storage by calculating the final
; mask on-the-fly.
;----------------------------------------------------------------------------------

div24by16::
; Check for boundary conditions and set up the divide.
    mov		[dividend+3],0
; Look for divide by zero.
    mov		a,[divisor+1]		; See if the divisor is zero.
    or		a,[divisor]
    jnz		.divisor_not_zero
; Divide by zero.
    mov		a,0xFF			; Divide by zero? Get as close to infinity as we can.
    mov		[dividend],a
    mov		[dividend+1],a
    mov		[dividend+2],a
    ret

.divisor_not_zero:
;divide.c: 62: }
;divide.c: 63: shift_count = 1;
    mov		[shift_count],1+8			; Difference in #bits between dividend and divisor. Plus 1.
;divide.c: 64: while [[divisor & 0x8000] == 0]
    jmp		.divisor_shift_start

.divisor_shift_loop:
;divide.c: 65: {
;divide.c: 66: ++shift_count;
    inc		[shift_count]
;divide.c: 67: divisor <<= 1;
    asl		[divisor]
    rlc		[divisor+1]
.divisor_shift_start:
    tst		[divisor+1],0b10000000
    jz		.divisor_shift_loop
    jmp		.dividend_shift_start

.dividend_shift_loop:
;divide.c: 72: {
;divide.c: 73: if [--shift_count == 0]
    dec		[shift_count]
    jnz		.l10
;divide.c: 74: {
;divide.c: 75: dividend = 0;
    mov		[dividend],0
    mov		[dividend+1],0
    mov		[dividend+2],0
                ; dividend+3 is already zero.
    ret			; Render the output and return.

.l10:
;divide.c: 77: }
;divide.c: 78: dividend <<= 1;
    asl		[dividend]
    rlc		[dividend+1]
    rlc		[dividend+2]
                                        ; We'll never shift anything into dividend+3 in this loop.
.begin_continue:
.dividend_shift_start:
    tst		[dividend+2],0b10000000		; See if the high bit is set.
    jz		.dividend_shift_loop		; If not, repeat until it is.
;divide.c: 79: }
;divide.c: 80: {
;divide.c: 81: unsigned char mask_count = shift_count;
    mov		[mask_count],[shift_count]	; shift_count is guaranteed to be at least 1.
;divide.c: 82: mask = 0;
    mov		[mask],0
    mov		[mask+1],0
    mov		[mask+2],0
;divide.c: 88: while [mask_count]
.mask_shift_loop:
;divide.c: 89: {
;divide.c: 90: mask = [mask << 1] | 1;
    asl		[mask]
    rlc		[mask+1]
    rlc		[mask+2]
    or		[mask],1
;divide.c: 91: --mask_count;
    dec		[mask_count]
    jnz		.mask_shift_loop
;divide.c: 92: }
;divide.c: 93: }
;divide.c: 94: do

.looping:
;divide.c: 95: {
;divide.c: 96: if [divisor <= dividend]
    mov		a,[dividend+3]			; Dividend has a higher byte than divisor.
    jnz		.dividend_is_bigger		; If dividend high byte is non-zero, dividend is bigger by definition.
    mov		a,[divisor+1]
    sub		a,[dividend+2]
    jnz		.u925
    mov		a,[divisor]
    sub		a,[dividend+1]
    jz		.dividend_is_bigger
.u925:
    jnc		.divisor_is_bigger		; They're not the same, see which one is bigger.
.dividend_is_bigger:
;divide.c: 97: {
;divide.c: 98: dividend = [[dividend - divisor] << 1] | 1;
    mov		a,[divisor]
    sub		[dividend+1],a
    mov		a,[divisor+1]
    sbb		[dividend+2],a
    sbb		[dividend+3],0
    asl		[dividend+0]
    rlc		[dividend+1]
    rlc		[dividend+2]
    rlc		[dividend+3]
    or		[dividend+0],1
;divide.c: 99: }
    jmp		.l19

.divisor_is_bigger:
;divide.c: 100: else
;divide.c: 101: {
;divide.c: 102: dividend = [dividend << 1];
    asl		[dividend+0]
    rlc		[dividend+1]
    rlc		[dividend+2]
    rlc		[dividend+3]

.l19:
;divide.c: 103: }
;divide.c: 104: } while [--shift_count];
    dec		[shift_count]
    jnz		.looping
.u950:

.l16:
;divide.c: 105: dividend &= mask;
    mov		a,[mask]
    and		[dividend],a
    mov		a,[mask+1]
    and		[dividend+1],a
    mov		a,[mask+2]
    and		[dividend+2],a
    mov		[dividend+3],0
;divide.c: 106: }
    ret

;----------------------------------------------------------------------------------
; Divide a 32-bit number by a 16-bit number. Leaves the quotient in the dividend
; [numerator] and throws away the remainder.
;
; This works like a classic long division; it shifts the numerator and denominator
; to the top of their respective words [so they are the same magnitude], counts
; the number of shifts and uses it to calculate the number of subtracts needed as
; well as the mask that eliminates the remainder.
;
; This can easily be extended or reduced to accomodate various word sizes. There
; should be no problem dividing a smaller word by a larger, but you will have
; to virtually extend the numerator with zeros. Since the initial shift factor will
; start out negative you will need to fiddle with that; you do not want to quit
; if you generate a carry, but instead test the sign of the result. [You still will
; want to jump to .zero_quotient if the shift factor is negative. Note that a shift
; factor of zero is valid.] [This version tests for zero, not carry, so that's
; no longer true. Lowest valid shift factor is 1.]
;
; If you expect to be using long words or want to macro-ize this [or expect to
; adapt it a lot] you may want to use multi-byte words and index into them
; [e.g. [numerator+1] instead of discrete names for each byte. However, the result
; will be endian-dependent. [This one is little-endian.]
;
; If you are expecting small values in the numerator or denominator, you can
; check the high byte for 0 and do 8-bit shifts before doing the final 1-bit shifts.
; Since the 8-bit-shift loop starts with a cmp [or tst] and a jump, the cost is
; only about 13 cycles if it is not necessary. [Cypress cycles.]
;
; If you need to keep the remainder it gets a bit more complicated. The calculated
; count, and the final calculated mask, will tell you where the remainder is in
; the numerator: the remainder's LSb is just above the highest 1-bit in the mask.
; You may need to copy it out, mask and shift it before saving the quotient. You
; can also contrive to save the quotient into a different word rather than into
; the numerator, in which case you only need to shift the remainder back around.
; However, this will cost you more shifts during the calculation [you need to
; shift the numerator and the quotient separately].
;
; You can save yourself several bytes of RAM storage by calculating the final
; mask on-the-fly.
;----------------------------------------------------------------------------------

div32by16::
; Check for boundary conditions and set up the divide.
    mov		[dividend+4],0
; Look for divide by zero.
    mov		a,[divisor+1]		; See if the divisor is zero.
    or		a,[divisor]
    jnz		.divisor_not_zero
; Divide by zero.
    mov		a,0xFF			; Divide by zero? Get as close to infinity as we can.
    mov		[dividend],a
    mov		[dividend+1],a
    mov		[dividend+2],a
    mov		[dividend+3],a
    ret

.divisor_not_zero:
;divide.c: 62: }
;divide.c: 63: shift_count = 1;
    mov		[shift_count],1+16			; Difference in #bits between dividend and divisor. Plus 1.
;divide.c: 64: while [[divisor & 0x8000] == 0]
    jmp		.divisor_shift_start

.divisor_shift_loop:
;divide.c: 65: {
;divide.c: 66: ++shift_count;
    inc		[shift_count]
;divide.c: 67: divisor <<= 1;
    asl		[divisor]
    rlc		[divisor+1]
.divisor_shift_start:
    tst		[divisor+1],0b10000000
    jz		.divisor_shift_loop
    jmp		.dividend_shift_start

.dividend_shift_loop:
;divide.c: 72: {
;divide.c: 73: if [--shift_count == 0]
    dec		[shift_count]
    jnz		.l10
;divide.c: 74: {
;divide.c: 75: dividend = 0;
    mov		[dividend],0
    mov		[dividend+1],0
    mov		[dividend+2],0
    mov		[dividend+3],0
                ; dividend+4 is already zero.
    ret			; Render the output and return.

.l10:
;divide.c: 77: }
;divide.c: 78: dividend <<= 1;
    asl		[dividend]
    rlc		[dividend+1]
    rlc		[dividend+2]
    rlc		[dividend+3]
                                        ; We'll never shift anything into dividend+4 in this loop.
.begin_continue:
.dividend_shift_start:
    tst		[dividend+3],0b10000000		; See if the high bit is set.
    jz		.dividend_shift_loop		; If not, repeat until it is.
;divide.c: 79: }
;divide.c: 80: {
;divide.c: 81: unsigned char mask_count = shift_count;
    mov		[mask_count],[shift_count]	; shift_count is guaranteed to be at least 1.
;divide.c: 82: mask = 0;
    mov		[mask],0
    mov		[mask+1],0
    mov		[mask+2],0
    mov		[mask+3],0
;divide.c: 88: while [mask_count]
.mask_shift_loop:
;divide.c: 89: {
;divide.c: 90: mask = [mask << 1] | 1;
    asl		[mask]
    rlc		[mask+1]
    rlc		[mask+2]
    rlc		[mask+3]
    or		[mask],1
;divide.c: 91: --mask_count;
    dec		[mask_count]
    jnz		.mask_shift_loop
;divide.c: 92: }
;divide.c: 93: }
;divide.c: 94: do

.looping:
;divide.c: 95: {
;divide.c: 96: if [divisor <= dividend]
    mov		a,[dividend+4]			; Dividend has a higher byte than divisor.
    jnz		.dividend_is_bigger		; If dividend high byte is non-zero, dividend is bigger by definition.
    mov		a,[divisor+1]
    sub		a,[dividend+3]
    jnz		.u925
    mov		a,[divisor]
    sub		a,[dividend+2]
    jz		.dividend_is_bigger
.u925:
    jnc		.divisor_is_bigger		; They're not the same, see which one is bigger.
.dividend_is_bigger:
;divide.c: 97: {
;divide.c: 98: dividend = [[dividend - divisor] << 1] | 1;
    mov		a,[divisor]
    sub		[dividend+2],a
    mov		a,[divisor+1]
    sbb		[dividend+3],a
    sbb		[dividend+4],0
    asl		[dividend+0]
    rlc		[dividend+1]
    rlc		[dividend+2]
    rlc		[dividend+3]
    rlc		[dividend+4]
    or		[dividend+0],1
;divide.c: 99: }
    jmp		.l19

.divisor_is_bigger:
;divide.c: 100: else
;divide.c: 101: {
;divide.c: 102: dividend = [dividend << 1];
    asl		[dividend+0]
    rlc		[dividend+1]
    rlc		[dividend+2]
    rlc		[dividend+3]
    rlc		[dividend+4]

.l19:
;divide.c: 103: }
;divide.c: 104: } while [--shift_count];
    dec		[shift_count]
    jnz		.looping
.u950:

.l16:
;divide.c: 105: dividend &= mask;
    mov		a,[mask]
    and		[dividend],a
    mov		a,[mask+1]
    and		[dividend+1],a
    mov		a,[mask+2]
    and		[dividend+2],a
    mov		a,[mask+3]
    and		[dividend+3],a
    mov		[dividend+4],0
;divide.c: 106: }
    ret
;
; Copyright 2010 Jeffrey J. Nonken
; License: permission granted for unlimited use and unlimited derived use.
; If you use this, I’d appreciate it if you’d give me credit.
; Just leave these comments intact or mention me in your comments. I don’t need
; money for this, but I’d appreciate credit where it’s due.
; (Though if you want to send me money, I won’t refuse it. :)
;
; Like what I did here? Feel free to offer me a job. http://jeffrey.nonken.net/
;
; Let me know if you have any improvements. PLEASE let me know if you find bugs.

;
; This is an integer divide function, 24 bits divided by 16 bits.
; Stuff your 24-bit value into “dividend”, 16-bit value into “divisor” and call div24by16.
; Note that “dividend” is a 4-byte value, the high byte is used internally for overflow.
; Any value you place there will get squashed first thing.
; The quotient of the divide will be left in dividend, so make sure your value is safe someplace else.
; For that matter, I destroy divisor, too.
; The remainder is computed as a side-effect of the divide but we do not attempt to preserve it.
; A few lines of extra code could be written to return the remainder in “divisor”, but it will take
; some shifting around to justify the value properly. Use the inverse of “mask” on the result in
; “divisor” and the value calculated in “shift_count” to figure that one out. Don’t forget to
; include the overflow bit.
;
; I originally created this algorithm in the 1980s to work on an 8086 processor.
; Later I ported it to the Cypress M8 processor for 24 over 16. [That code is in the Times Square Ball, by the way.]
; In order to avoid some tricky stuff on the PIC 16 series processor I ported it to “C”, ran a test on a simulator,
; debugged it, then got the PICC C compiler to generate assembly code. I hand-optimized the result.
; This is what’s left.
;
; I wasn’t using the PICC C compiler to write the code for the project because it’s not well-enough integrated
; into MPLAB to do simulation or emulation. I used Microchip’s C18 compiler set to an 18F4455 MCU for the
; test/debug cycle, then used PICC C to generate the assembly.
;
; It’s not pretty, but it works.
;

AREA    bss [RAM, REL]
AREA InterruptRAM      [RAM, REL, CON]   ; Interrupts, on Page 0

quotient::                    ; Outgoing quotient is in dividend.
dividend::        BLK    5        ; Incoming dividend, outgoing quotient.
divisor::        BLK    2        ; Incoming divisor. This gets destroyed.
mask:            BLK    4        ; Mask used to separate quotient from remainder.
shift_count:    BLK    1        ; Number of shifts required to complete the operation.
;  Also number of bits in the quotient mask.
mask_count:                    ; Counter used to generate the mask.
;  [Re-use carry, we don't use both at one time.]
carry:            BLK    1        ; Holding register for the carry bit. Used in the only
;  true 3-byte operation we have.

AREA text [ROM, REL, CON]

;
; This function tries to be efficient by shifting both values to the far left to minimize the number of repititions
; through the main loop. That comes at a cost, of course, but I expect that most of the value pairs I use will do
; better this way.
;
; Dividing 1024 by 33 takes about 345 cycles, including the call/return. Just for the record.
;
; Here’s the original comment block. Note that it was written for 24×16 divide.
;———————————————————————————-
; Divide a 24-bit number by a 16-bit number. Leaves the quotient in the dividend
; [numerator] and throws away the remainder.
;
; This works like a classic long division; it shifts the numerator and denominator
; to the top of their respective words [so they are the same magnitude], counts
; the number of shifts and uses it to calculate the number of subtracts needed as
; well as the mask that eliminates the remainder.
;
; This can easily be extended or reduced to accomodate various word sizes. There
; should be no problem dividing a smaller word by a larger, but you will have
; to virtually extend the numerator with zeros. Since the initial shift factor will
; start out negative you will need to fiddle with that; you do not want to quit
; if you generate a carry, but instead test the sign of the result. [You still will
; want to jump to .zero_quotient if the shift factor is negative. Note that a shift
; factor of zero is valid.] [This version tests for zero, not carry, so that's
; no longer true. Lowest valid shift factor is 1.]
;
; If you expect to be using long words or want to macro-ize this [or expect to
; adapt it a lot] you may want to use multi-byte words and index into them
; [e.g. [numerator+1] instead of discrete names for each byte. However, the result
; will be endian-dependent. [This one is little-endian.]
;
; If you are expecting small values in the numerator or denominator, you can
; check the high byte for 0 and do 8-bit shifts before doing the final 1-bit shifts.
; Since the 8-bit-shift loop starts with a cmp [or tst] and a jump, the cost is
; only about 13 cycles if it is not necessary. [Cypress cycles.]
;
; If you need to keep the remainder it gets a bit more complicated. The calculated
; count, and the final calculated mask, will tell you where the remainder is in
; the numerator: the remainder’s LSb is just above the highest 1-bit in the mask.
; You may need to copy it out, mask and shift it before saving the quotient. You
; can also contrive to save the quotient into a different word rather than into
; the numerator, in which case you only need to shift the remainder back around.
; However, this will cost you more shifts during the calculation [you need to
; shift the numerator and the quotient separately].
;
; You can save yourself several bytes of RAM storage by calculating the final
; mask on-the-fly.
;———————————————————————————-

div24by16::
; Check for boundary conditions and set up the divide.
mov        [dividend+3],0
; Look for divide by zero.
mov        a,[divisor+1]        ; See if the divisor is zero.
or        a,[divisor]
jnz        .divisor_not_zero
; Divide by zero.
mov        a,0xFF            ; Divide by zero? Get as close to infinity as we can.
mov        [dividend],a
mov        [dividend+1],a
mov        [dividend+2],a
ret

.divisor_not_zero:
;divide.c: 62: }
;divide.c: 63: shift_count = 1;
mov        [shift_count],1+8            ; Difference in #bits between dividend and divisor. Plus 1.
;divide.c: 64: while [[divisor & 0x8000] == 0]
jmp        .divisor_shift_start

.divisor_shift_loop:
;divide.c: 65: {
;divide.c: 66: ++shift_count;
inc        [shift_count]
;divide.c: 67: divisor <<= 1;
asl        [divisor]
rlc        [divisor+1]
.divisor_shift_start:
tst        [divisor+1],0b10000000
jz        .divisor_shift_loop
jmp        .dividend_shift_start

.dividend_shift_loop:
;divide.c: 72: {
;divide.c: 73: if [--shift_count == 0]
dec        [shift_count]
jnz        .l10
;divide.c: 74: {
;divide.c: 75: dividend = 0;
mov        [dividend],0
mov        [dividend+1],0
mov        [dividend+2],0
; dividend+3 is already zero.
ret            ; Render the output and return.

.l10:
;divide.c: 77: }
;divide.c: 78: dividend <<= 1;
asl        [dividend]
rlc        [dividend+1]
rlc        [dividend+2]
; We’ll never shift anything into dividend+3 in this loop.
.begin_continue:
.dividend_shift_start:
tst        [dividend+2],0b10000000        ; See if the high bit is set.
jz        .dividend_shift_loop        ; If not, repeat until it is.
;divide.c: 79: }
;divide.c: 80: {
;divide.c: 81: unsigned char mask_count = shift_count;
mov        [mask_count],[shift_count]    ; shift_count is guaranteed to be at least 1.
;divide.c: 82: mask = 0;
mov        [mask],0
mov        [mask+1],0
mov        [mask+2],0
;divide.c: 88: while [mask_count]
.mask_shift_loop:
;divide.c: 89: {
;divide.c: 90: mask = [mask << 1] | 1;
asl        [mask]
rlc        [mask+1]
rlc        [mask+2]
or        [mask],1
;divide.c: 91: –mask_count;
dec        [mask_count]
jnz        .mask_shift_loop
;divide.c: 92: }
;divide.c: 93: }
;divide.c: 94: do

.looping:
;divide.c: 95: {
;divide.c: 96: if [divisor <= dividend]
mov        a,[dividend+3]            ; Dividend has a higher byte than divisor.
jnz        .dividend_is_bigger        ; If dividend high byte is non-zero, dividend is bigger by definition.
mov        a,[divisor+1]
sub        a,[dividend+2]
jnz        .u925
mov        a,[divisor]
sub        a,[dividend+1]
jz        .dividend_is_bigger
.u925:
jnc        .divisor_is_bigger        ; They’re not the same, see which one is bigger.
.dividend_is_bigger:
;divide.c: 97: {
;divide.c: 98: dividend = [[dividend - divisor] << 1] | 1;
mov        a,[divisor]
sub        [dividend+1],a
mov        a,[divisor+1]
sbb        [dividend+2],a
sbb        [dividend+3],0
asl        [dividend+0]
rlc        [dividend+1]
rlc        [dividend+2]
rlc        [dividend+3]
or        [dividend+0],1
;divide.c: 99: }
jmp        .l19

.divisor_is_bigger:
;divide.c: 100: else
;divide.c: 101: {
;divide.c: 102: dividend = [dividend << 1];
asl        [dividend+0]
rlc        [dividend+1]
rlc        [dividend+2]
rlc        [dividend+3]

.l19:
;divide.c: 103: }
;divide.c: 104: } while [--shift_count];
dec        [shift_count]
jnz        .looping
.u950:

.l16:
;divide.c: 105: dividend &= mask;
mov        a,[mask]
and        [dividend],a
mov        a,[mask+1]
and        [dividend+1],a
mov        a,[mask+2]
and        [dividend+2],a
mov        [dividend+3],0
;divide.c: 106: }
ret

;———————————————————————————-
; Divide a 32-bit number by a 16-bit number. Leaves the quotient in the dividend
; [numerator] and throws away the remainder.
;
; This works like a classic long division; it shifts the numerator and denominator
; to the top of their respective words [so they are the same magnitude], counts
; the number of shifts and uses it to calculate the number of subtracts needed as
; well as the mask that eliminates the remainder.
;
; This can easily be extended or reduced to accomodate various word sizes. There
; should be no problem dividing a smaller word by a larger, but you will have
; to virtually extend the numerator with zeros. Since the initial shift factor will
; start out negative you will need to fiddle with that; you do not want to quit
; if you generate a carry, but instead test the sign of the result. [You still will
; want to jump to .zero_quotient if the shift factor is negative. Note that a shift
; factor of zero is valid.] [This version tests for zero, not carry, so that's
; no longer true. Lowest valid shift factor is 1.]
;
; If you expect to be using long words or want to macro-ize this [or expect to
; adapt it a lot] you may want to use multi-byte words and index into them
; [e.g. [numerator+1] instead of discrete names for each byte. However, the result
; will be endian-dependent. [This one is little-endian.]
;
; If you are expecting small values in the numerator or denominator, you can
; check the high byte for 0 and do 8-bit shifts before doing the final 1-bit shifts.
; Since the 8-bit-shift loop starts with a cmp [or tst] and a jump, the cost is
; only about 13 cycles if it is not necessary. [Cypress cycles.]
;
; If you need to keep the remainder it gets a bit more complicated. The calculated
; count, and the final calculated mask, will tell you where the remainder is in
; the numerator: the remainder’s LSb is just above the highest 1-bit in the mask.
; You may need to copy it out, mask and shift it before saving the quotient. You
; can also contrive to save the quotient into a different word rather than into
; the numerator, in which case you only need to shift the remainder back around.
; However, this will cost you more shifts during the calculation [you need to
; shift the numerator and the quotient separately].
;
; You can save yourself several bytes of RAM storage by calculating the final
; mask on-the-fly.
;———————————————————————————-

div32by16::
; Check for boundary conditions and set up the divide.
mov        [dividend+4],0
; Look for divide by zero.
mov        a,[divisor+1]        ; See if the divisor is zero.
or        a,[divisor]
jnz        .divisor_not_zero
; Divide by zero.
mov        a,0xFF            ; Divide by zero? Get as close to infinity as we can.
mov        [dividend],a
mov        [dividend+1],a
mov        [dividend+2],a
mov        [dividend+3],a
ret

.divisor_not_zero:
;divide.c: 62: }
;divide.c: 63: shift_count = 1;
mov        [shift_count],1+16            ; Difference in #bits between dividend and divisor. Plus 1.
;divide.c: 64: while [[divisor & 0x8000] == 0]
jmp        .divisor_shift_start

.divisor_shift_loop:
;divide.c: 65: {
;divide.c: 66: ++shift_count;
inc        [shift_count]
;divide.c: 67: divisor <<= 1;
asl        [divisor]
rlc        [divisor+1]
.divisor_shift_start:
tst        [divisor+1],0b10000000
jz        .divisor_shift_loop
jmp        .dividend_shift_start

.dividend_shift_loop:
;divide.c: 72: {
;divide.c: 73: if [--shift_count == 0]
dec        [shift_count]
jnz        .l10
;divide.c: 74: {
;divide.c: 75: dividend = 0;
mov        [dividend],0
mov        [dividend+1],0
mov        [dividend+2],0
mov        [dividend+3],0
; dividend+4 is already zero.
ret            ; Render the output and return.

.l10:
;divide.c: 77: }
;divide.c: 78: dividend <<= 1;
asl        [dividend]
rlc        [dividend+1]
rlc        [dividend+2]
rlc        [dividend+3]
; We’ll never shift anything into dividend+4 in this loop.
.begin_continue:
.dividend_shift_start:
tst        [dividend+3],0b10000000        ; See if the high bit is set.
jz        .dividend_shift_loop        ; If not, repeat until it is.
;divide.c: 79: }
;divide.c: 80: {
;divide.c: 81: unsigned char mask_count = shift_count;
mov        [mask_count],[shift_count]    ; shift_count is guaranteed to be at least 1.
;divide.c: 82: mask = 0;
mov        [mask],0
mov        [mask+1],0
mov        [mask+2],0
mov        [mask+3],0
;divide.c: 88: while [mask_count]
.mask_shift_loop:
;divide.c: 89: {
;divide.c: 90: mask = [mask << 1] | 1;
asl        [mask]
rlc        [mask+1]
rlc        [mask+2]
rlc        [mask+3]
or        [mask],1
;divide.c: 91: –mask_count;
dec        [mask_count]
jnz        .mask_shift_loop
;divide.c: 92: }
;divide.c: 93: }
;divide.c: 94: do

.looping:
;divide.c: 95: {
;divide.c: 96: if [divisor <= dividend]
mov        a,[dividend+3]            ; Dividend has a higher byte than divisor.
jnz        .dividend_is_bigger        ; If dividend high byte is non-zero, dividend is bigger by definition.
mov        a,[divisor+1]
sub        a,[dividend+3]
jnz        .u925
mov        a,[divisor]
sub        a,[dividend+2]
jz        .dividend_is_bigger
.u925:
jnc        .divisor_is_bigger        ; They’re not the same, see which one is bigger.
.dividend_is_bigger:
;divide.c: 97: {
;divide.c: 98: dividend = [[dividend - divisor] << 1] | 1;
mov        a,[divisor]
sub        [dividend+2],a
mov        a,[divisor+1]
sbb        [dividend+3],a
sbb        [dividend+4],0
asl        [dividend+0]
rlc        [dividend+1]
rlc        [dividend+2]
rlc        [dividend+3]
rlc        [dividend+4]
or        [dividend+0],1
;divide.c: 99: }
jmp        .l19

.divisor_is_bigger:
;divide.c: 100: else
;divide.c: 101: {
;divide.c: 102: dividend = [dividend << 1];
asl        [dividend+0]
rlc        [dividend+1]
rlc        [dividend+2]
rlc        [dividend+3]
rlc        [dividend+4]

.l19:
;divide.c: 103: }
;divide.c: 104: } while [--shift_count];
dec        [shift_count]
jnz        .looping
.u950:

.l16:
;divide.c: 105: dividend &= mask;
mov        a,[mask]
and        [dividend],a
mov        a,[mask+1]
and        [dividend+1],a
mov        a,[mask+2]
and        [dividend+2],a
mov        a,[mask+3]
and        [dividend+3],a
mov        [dividend+4],0
;divide.c: 106: }
ret

More About Optimization

February 20th, 2010

Thanks for all the nice comments about my last post. It’s really gratifying to get so much praise so early.

I wanted to add a few thoughts. Kermit Derren says:

I was studying something else about this on another blog… Your position on it is diametrically contradicted to what I read earlier.

He’s not criticizing me, I left out some context, go back and read the whole comment. For that matter, neither am I criticizing him. It’s just that his comment made me think a little. I can’t comment directly on the other blog, not having read it (he didn’t provide a link). I want to emphasize that it’s possible that the other blogger is right, and that I’m also right, because what your priorities are depends very much on the situation.

My aforementioned former colleague who saved two bytes wasn’t so much misguided as he was just blindly following what he’d learned early in his career. He used to work in an industry where the resources were so thin and the timing so critical that the people who wrote the code knew every detail about every operation by heart, including all affected flags, any side-effects, and how many cycles they took to run. If there were any undocumented features they doubtless knew about those as well and used them. To these people finding two extra bytes was like discovering a new seam in what they thought was a played-out mine.

I’ve never lived in that world myself, though I’ve been closer than I am now. When you’re working in sub-1K memory space your priorities are different, believe me. So while I don’t know what the blog entry actually said, I’m willing to believe that an apparent disagreement might be due to a contextual difference. Then again, maybe he really believes that saving resources is a higher priority than writing clear, maintainable code in all circumstances. Until and unless Kermit shares the link I’m reduced to guesses.

Now… since I wrote “Optimize This!” I’ve watched my own habits more closely. I’ve found myself doing the very thing I’ve criticized — re-using variables with more of a thought to saving RAM (that I usually don’t need to save) than to program clarity. When I catch myself doing that of course I immediately repent and fix the problem. Hopefully I’ll break that habit soon and instill some better habits.

We are all of us hypocrites to some degree, and I am no exception, but I like to try not to be when I can help it.

A few days ago I had a programming situation where I had three conditions, each of which would cause a behavior (send a message) but were mutually exclusive (I can only send one at a time). I assigned each flag as a bit in a byte, with the higher priority as the higher value bit. Working in assembly there are a couple ways to do this:

  • An if-then-else that tests each bit and decides whether to execute the associated code.
  • A jump table that depends on the value of the byte. It will of course double in size each time I add a bit.

For only three flags I had eight values, so I made it a jump table. Jump tables are pretty fast. Then I discovered two more conditions, which meant that now I had thirty-two possible combinations that started to make the jump table look a bit daunting to create and maintain.

My first reaction was to make two jump tables and therefore two decisions, to keep the ROM usage down. Then I remembered my post and stopped to actually evaluate my situation. Once I thought it through the answer was obvious. I was using about 1/4 of my total ROM so far in a project that was mostly written. 48 extra bytes (two per jump) wasn’t going to make a dent in that, it was easier to maintain a single table than two tables plus glue logic, and it would take exactly the same number of execution cycles to deal with a 32-jump table as an 8-jump table. Instead of, say, three times as many cycles, if not more, if both tables were executed. As far as generating the table goes, well, I put it together using one of my favorite tools — Excel (or an open source alternative — usually Open Office or Gnumeric). This made generating redundant entries (and counting them) and re-arranging priorities much simpler to do without making mistakes (once I had the equations fine-tuned, of course).

Of course writing the Excel equations can take more time than just writing the bloody thing by hand, but it’s less tedious and, as I mentioned, less prone to making mistakes — especially when I change things around, because if I design the spreadsheet well enough, I can change things around and the table just automatically gets fixed.

In any case, I was much happier with the result.

I really like using a spreadsheet for generating (and sometimes maintaining) lists of things. You can, for example, generate the lookup table for a curve, including DB (Define Byte) statements and tabs and comments if need be, in Excel and then copy/paste the result directly into your source code. If you make a change it just re-calculates and you just copy/paste again. Excel has some nice table lookup functions too, string manipulation, and decimal/hex conversion. You don’t really need to be an expert — learn a few tricks and you’re halfway there. I depend a lot on the built-in help and on web searches when I’m trying to learn a new feature, but none of it is difficult. Read this article by Joel Spolsky called “The Process of Designing a Product”. About halfway down he talks about discovering that people mostly use Excel to keep lists. It’s certainly true in my case, and I’m glad they focused on that functionality as a result.

ℼ佄呃偙⁅瑨汭倠䉕䥌⁃ⴢ⼯㍗⽃䐯䑔堠呈䱍ㄠ〮吠慲獮瑩潩慮⽬䔯≎ऊ∉瑨灴⼺眯睷眮⸳牯⽧剔砯瑨汭⼱呄⽄桸浴ㅬ琭慲獮瑩潩慮⹬瑤≤ਾ格浴浸湬㵳栢瑴㩰⼯睷⹷㍷漮杲ㄯ㤹⼹桸浴≬砠汭氺湡㵧攢≮氠湡㵧攢≮ਾ格慥㹤ऊ琼瑩敬吾扡敬㱳琯瑩敬ਾ洼瑥⁡瑨灴攭畱癩∽潃瑮湥⵴祔数•潣瑮湥㵴琢硥⽴瑨汭※档牡敳㵴瑵ⵦ∸⼠ਾ㰉ⴡ‭䜢偟啌䥇彎但归呈䱍•ⴭਾ猼祴敬琠灹㵥琢硥⽴獣≳ਾ瑴笠ऊ潦瑮昭浡汩㩹挠畯楲牥਻੽摴笠ऊ潦瑮昭浡汩㩹栠汥敶楴慣‬慳獮猭牥晩਻੽慣瑰潩੻昉湯⵴慦業祬›敨癬瑥捩ⱡ猠湡⵳敳楲㭦ऊ潦瑮猭穩㩥ㄠ瀴㭴ऊ整瑸愭楬湧›敬瑦਻੽⼼瑳汹㹥㰊栯慥㹤㰊潢祤ਾ瀼⼠㰾慴汢⁥散汬灳捡湩㵧〢•散汬慰摤湩㵧㌢㸢㰊牴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㰰琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢潮愠瑣潩㱮琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻〾⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㬾丠瑯楨杮琠潤㰡琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㰱琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢䕒佐呒偟协呉佉㱎琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤潰楳楴湯⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㰱琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤潰楳楴湯⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㈾⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻刾偅剏彔䕓噒彏䥔䕍⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮獟牥潶瑟浩㱥琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㈾⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮獟牥潶瑟浩㱥琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㰴琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢䕒佐呒䵟呏剏呟䵉㱅琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤潭潴彲楴敭⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㰳琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤敳癲彯楴敭⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㠾⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻刾偅剏彔䥌䥍㱔琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤楬業㱴琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㐾⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮浟瑯牯瑟浩㱥琯㹤㰊摴挠汯灳湡∽∴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢※敓摮洠瑯牯琠畲灭⁳敳摮猠牥潶㰮琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㘱⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻䰾十彔䅖啌彅䕐䑎义㱇琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㔾⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮浟瑯牯瑟浩㱥琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㰶琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤潭潴彲楴敭⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㜾⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮浟瑯牯瑟浩㱥琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㰸琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤楬業㱴琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴挠汯灳湡∽∶†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢※敓摮氠浩瑩琠畲灭⁳敳摮瀠獯瑩潩牯洠瑯牯猠数摥⹳⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㤾⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮江浩瑩⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻ㄾ㰰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤楬業㱴琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢ㄱ⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮江浩瑩⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻ㄾ㰲琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤楬業㱴琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㌱⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮江浩瑩⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻ㄾ㰴琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢献湥彤楬業㱴琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㔱⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻⸾敳摮江浩瑩⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻ㄾ㰶琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴挠汯灳湡∽∴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢※慗瑩湩⁧潦⁲牰癥潩獵瘠污敵琠敳摮㰮琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㜱⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻ㄾ㰸琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㤱⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㈾㰰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢ㄲ⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㈾㰲琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㌲⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㈾㰴琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㔲⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㈾㰶琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㜲⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㈾㰸琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢㤲⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ琼㹲㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽楲桧≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻㌾㰰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢浪㱰琯㹤㰊摴挠汯灳湡∽∲†慶楬湧∽潢瑴浯•愠楬湧∽敬瑦•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢敮瑸獟整㱰琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊摴†瑳汹㵥∢㰾琯㹤㰊琯㹲㰊牴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮爢杩瑨•猠祴敬∽映湯⵴楳敺ㄺ瀱㭴㸢ㄳ⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻樾灭⼼摴ਾ琼⁤潣獬慰㵮㈢•瘠污杩㵮戢瑯潴≭†污杩㵮氢晥≴†瑳汹㵥•潦瑮猭穩㩥ㄱ瑰∻渾硥彴瑳灥⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ琼⁤猠祴敬∽㸢⼼摴ਾ⼼牴ਾ⼼慴汢㹥㰊戯摯㹹㰊栯浴㹬 ഀ䫄൵䥴൵�൱而൰�൱�൱�൱而൰ȴ൲Ɍ൲�൱而൰�൱�൱�൱而൰൱൱�൱而൰�൱�൱�൱而൰ผ൲ิ൲�൱而൰�൱�൱�൱而൰൱劼൲�൱而൰䫜൵䫴൵䬌൵䬤൵䬼൵䭔൵䭬൵䮄൵䮜൵䮴൵䯌൵䯤൵䯼൵䰔൵䰬൵䱄൵䱜൵䱴൵䲌൵�൱而൰�൱�൱�൱而൰൱≴൲�൱而൰�൱�൱�൱而൰൱㔄൲�൱而൰�൱�൱�൱而൰�൱处൲�൱而൰�൱�൱�൱而൰൱൱�൱而൰�൱�൱�൱而൰൱妔൲�൱而൰�൱�൱�൱而൰�൱䴄൲�൱而൰�൱�൱�൱而൰�൱㘼൲�൱而൰�൱�൱�൱而൰�൱⢼൲�൱而൰�൱�൱�൱而൰�൱Ẵ൲�൱而൰�൱�൱�൱而൰�൱ු൲�൱而൰�൱�൱�൱而൰൱ﻔ൱�൱而൰�൱�൱�൱而൰�൱൱�൱而൰�൱�൱�൱而൰�൱⟌൲�൱而൰�൱�൱�൱而൰൱௜൲�൱而൰�൱�൱�൱而൰൱ﵬ൱�൱而൰�൱�൱�൱而൰൱൱�൱而൰�൱�൱�൱而൰�൱൱�൱而൰�൱�൱�൱而൰�൱�൱䲤൵�൱䲼൵䳔൵�൱䲼൵�൱而൰�൱�൱�൱而൰�൱൱䳬൵�൱䴄൵൱而൰�൱�൱൱而൰�൱䯌൲�൱而൰�൱�൱�൱尼൰൱�൱而൰�൱�൱�൱而൰˴൲಴൲�൱而൰�൱�൱�൱而൰൱൱�൱而൰�൱�൱�൱而൰ᢴ൲⭴൲�൱而൰�൱�൱�൱而൰᱄൲ᱜ൲�൱而൰�൱�൱�൱而൰巤൲巼൲�൱而൰�൱�൱�൱而൰Ӕ൲Ӭ൲�൱而൰�൱�൱�൱而൰幜൲年൲�൱而൰�൱�൱�൱而൰弜൲弴൲�൱而൰�൱�൱�൱而൰䮄൲䮜൲�൱而൰�൱�൱�൱而൰൱൱�൱而൰�൱�൱�൱而൰“൲‴൲�൱而൰�൱�൱�൱而൰Д൲Ь൲�൱而൰�൱�൱�൱而൰ൄ൲൜൲൱而൰�൱�൱൱而൰൱൱�൱而൰�൱�൱�൱而൰庌൲庤൲⺼൵而൰൱而൰�൱�൱൱而൰�൱㥔൲�൱而൰�൱�൱�൱而൰㶤൲㶼൲�൱而൰�൱�൱�൱而൰公൲�൱而൰�൱�൱�൱而൰☄൲☜൲�൱而൰�൱�൱�൱而൰Ỽ൲ἔ൲�൱而൰�൱�൱�൱而൰᠌൲ᠤ൲�൱而൰�൱�൱�൱而൰൱൱�൱而൰�൱�൱�൱而൰൱൱而൰垔൱垔൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱�൱㚼൹㛔൹㛬൹㜄൹㜜൹㜴൹㝌൹㝤൹㝼൹㞔൹㞬൹ᴌ൹쩜൸�൱而൰牼൷�൱而൰皴൷盌൷尼൰䒬൲盤൷盼൷錔൶ޜൺ޴ൺ�൱而൰붔൴붔൴而൰�൱�൱ᙄ൲而൰붬൴뷄൴뷜൴뷴൴브൴븤൴븼൴빔൴빬൴뺄൴뺜൴뺴൴뻌൴뻤൴뻼൴뼔൴뼬൴뽄൴뽜൴뽴൴뾌൴뾤൴뾼൴뿔൴뿬൴쀄൴쀜൴쀴൴쁌൴쁤൴쁼൴삔൴사൴샄൴샜൴�൱㒔൰㒔൰噜൱噜൱�൱而൰ߌൺ�൹䥴൵ߤൺ߼ൺ�൱ꃜ൶疔൷疔൷൷㟄൹㟜൹㟴൹㠌൹㠤൹㠼൹㡔൹㡬൹㢄൹㢜൹⺼൵而൰�൱而൰䒬൲ᅣ൷๬൳�൱而൰嗜൰൷ل൳�൱而൰暌൰൷Ӝ൳�൱而൰璄൰൷ӄ൳�൱而൰嘼൰൷Ҭ൳�൱而൰蘤൰൷Ҕ൳�൱而൰䏴൰൷Ѽ൳�൱而൰㏔൰൷Ѥ൳�൱而൰叼൰൷ь൳�൱而൰獤൰൷д൳�൱而൰䯬൰൷М൳�൱而൰勴൰൷Є൳�൱而൰乄൰൷Ϭ൳�൱而൰膤൰൷ϔ൳�൱而൰㪔൰൷μ൳�൱而൰佤൰൷Τ൳൷�൱而൰佤൰൷Ό൳�൱而൰佤൰൷ʹ൳�൱而൰犌൰൷͜൳�൱而൰涬൰൷̈́൳�൱而൰䬔൰൷̬൳൷�൱而൰禔൰൷̔൳�൱而൰禔൰൷˼൳�൱而൰琤൰൷ˤ൳�൱而൰袬൰൷ˌ൳�൱而൰姤൰൷ʴ൳�൱而൰搴൰൷ʜ൳൷�൱而൰搴൰൷ʄ൳�൱而൰搴൰൷ɬ൳�൱而൰苄൰൷ɔ൳�൱而൰䐌൰൷ȼ൳�൱而൰誤൰൷Ȥ൳�൱而൰舄൰൷Ȍ൳�൱而൰抴൰൷Ǵ൳�൱而൰牄൰൷ǜ൳�൱而൰歔൰൷DŽ൳�൱而൰㺜൰൷Ƭ൳�൱而൰煬൰൷Ɣ൳�൱而൰㗌൰൷ż൳�൱而൰㒔൰൷Ť൳�൱而൰脜൷൷Ō൳�൱而൰脜൷൷Ĵ൳�൱而൰贬൰൷Ĝ൳�൱而൰㒔൰൷Ą൳�൱而൰㼌൲㹴൷㺌൷㺤൷㺼൷㻔൷㻬൷㼄൷㼜൷㼴൷㽌൷㽤൷㽼൷㾔൷㾬൷㿄൷㿜൷㿴൷䀌൷䀤൷䀼൷䁔൷䁬൷㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲㸔൷㺔൲ぼ൷�൱㻄൲つ൷�൱㻄൲㸬൷�൱㻄൲ﴄ൶�൱㻄൲㻄൲溔൲陴൰�൱�൱㻄൲隤൰䂌൲䂤൲䂼൲溬൲䏔൲䥜൲䓄൲䥴൲䦤൲滄൲坌൱�൱而൰壼൱�൱㻄൲�൱㻄൲�൱㻄൲�൱㻄൲�൱㻄൲�൱㻄൲�൱㻄൲�൱㻄൲�൱㻄൲�൱㻄൲ﴜ൶�൱㻄൲㹄൷�൱㻄൲㹜൷�൱㻄൲ゔ൷�൱㻄൲�൱㻜൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱㼌൲�൱而൰㼤൲䂄൷䂜൷䂴൷䃌൷䃤൷䃼൷䄔൷䄬൷䅄൷䅜൷䅴൷䆌൷䆤൷䆼൷䇔൷䇬൷䈄൷䈜൷䈴൷䉌൷䉤൷䉼൷䊔൷䊬൷䋄൷䋜൷䋴൷䌌൷䌤൷䌼൷䍔൷䍬൷䎄൷䎜൷䎴൷䏌൷䏤൷䏼൷䐔൷䐬൷䑄൷䑜൷䑴൷䒌൷䒤൷䒼൷䓔൷�൱㼤൲�൱㼤൲�൱㼤൲�൱而൰㼼൲䓬൷䔄൷䔜൷䔴൷䕌൷䕤൷䕼൷䖔൷�൱而൰㽔൲䖬൷䗄൷䗜൷䗴൷䘌൷䘤൷猀Փ铘=�ԪṀԎ倴਱싔੏值਱뵴Խ獀Փ铘=�ԪṀԎ偌਱殤̃偔਱분Խ玀Փ铘=�ԪṀԎ偤਱鈄©偬਱붤Խ酀Չ齰=뻠Ў䤠ਦ袉ҹ́й̱й

Look to the Edges

December 16th, 2009

“And what do you really do?” said Tiffany.

The thin witch hesitated for a moment, and then:

“We look to… the edges,” said Mistress Weatherwax. “There’re a lot of edges, more than people know. Between life and death, this world and the next, night and day, right and wrong… an’ they need watchin’. We watch ‘em, we guard the sum of things.”

We programmers need to watch the edges, too. Different edges, perhaps, but nevertheless there’re a lot of edges, more than people know.

Back in the ’90s I was administrating a backup system with about 180 client computers. One day it started going wonky. (I don’t remember in what fashion.) I couldn’t figure out why, so naturally I called tech support.

After I described the problem, the support tech said, “What did you change?” I didn’t change anything. “You must have changed something, it was working before and it’s not working now. What have you changed in the last few days?” Certainly the code didn’t change, and it obviously worked, so the fault must lie in the customer. He was very insistent and barely polite.

We did not solve the problem during that phone call.

Two days later an upgrade came out. It seems that when the catalog file grew over 2GB it started having problems. The upgrade fixed that problem. (I don’t remember, but I’ll bet it extended it to 4GB. If you run the numbers I think you’ll find that 2GB is exactly the same as the highest value you can fit into a signed 32-bit integer. What a coincidence!)

What did I change in those few days? Depends on your point of view; all I changed was the amount of data in the catalog, and that was by running the system exactly as I had been. Really I didn’t change anything, but things did change, and the program simply couldn’t keep up.

It’s a perfect illustration of edge conditions that we all should watch. Can the index overflow a word? Will the size of that value really, really work for all eternity, or just for the foreseeable future? Watch out for statements like “Nobody will EVER fill that up!” I heard that in 1982 about 5 megabyte hard drives. Nobody will ever fill that up. Heh. Now you can buy 5 terabytes for a few hundred dollars and carry it around in a couple decent-sized coat pockets. It won’t be long before you can buy it in a single drive for under $100. And nobody with any sense will tell you that you’ll never fill it up.

One very typical programming error is called an off-by-one error, which usually happens when you’re working with zero-based indexes and one-based sizes and lose track of which is which. You can consider this a kind of edge condition; it certainly causes edge-condition-type errors. You can easily copy one too few or one too many items, or simply index past the end of your data. By one. Or never quite reach it. By one.

It’s easy to find those edges in small embedded systems where you typically use as small a variable as you can get away with. I use single-byte words all the time. How often do I use 32-bit integers? Pretty gosh darn seldom. But 32-bit signed integers are the default for the C++ compiler I use for PC programming. It’s actually more efficient to let it default to 32-bit integers than to use bytes, because the native word on the Pentium-class processor is 32 bits. Using anything smaller just makes the system work harder, and it will probably allocate 32 bits anyway because it wants things to align nicely. You probably have to tell it to pack your structs if you want to use them outside the program.

It’s enough to make an 8-bit processor guy cry (for envy) into his special sheep liniment.

What really tends to sneak up to you, though, and bite you on the backside is when you convert or calculate or otherwise change from one word size to another. This happens often when you try to have two different sized systems talking to each other, though of course it can also happen within your small system. It also happens when you get sloppy programmers writing code that uses the minimum needed (but different) size for each variable and then willy-nilly does calculations and copies and indexes back-and-forth without even bothering to cast anything. Of course this guy uses a low warning level when he compiles. And then he leaves the job and you’re left trying to figure out why the code is so buggy. Or rather, why it’s buggy is obvious, but you have to find and kill the bugs.

My current job is all about PC-based flight simulator software exchanging data with simulator hardware that’s based on 8-bit micros. Fortunately most of the actual data are flowing from the hardware to the flight sim software; most of the time the data are going from 10 bits to floating point, and the probability of messing up the scaling is much, much higher than the possibility of losing your high bits.

On the other hand, some of the controls don’t use the full range of the sensors and so must be calibrated. Since the sensor in this case is probably a cheap potentiometer, the wires pick up noise, there’s noise on the power supply, noise introduced by the multiplexers, and noise in the A/D conversion, the value tends to jump around a little. So even the most careful calibration won’t guarantee that the incoming value will never exceed my calibrated limits unless I’m really fanatical or add some padding. Which I don’t really want to do  in this case.

What’s that all mean? It means the incoming data will probably exceed my edges, eventually. In this case I simply check the value against the end-points, and if it’s out of range I rail it.

But I do check.

“But there’s magic, too. You’ll pick that up. It don’t take much intelligence, otherwise wizards wouldn’t be able to do it.”

It’s not enough to be a wizard. You need to be a witch, and look to the edges.


Quotes and the reference to Special Sheep Liniment are from The Wee Free Men, a novel of the Discworld by Terry Pratchett.