DecayWTF
@DecayWTF

you motherfuckers keep talking about electrical arcs and I'm gonna make a big effortpost about Maxwell's equations and that is a threat


DecayWTF
@DecayWTF

First things first, let's get this out of the way:

What we understand as Maxwell's equations were not developed by James Clerk Maxwell.


(Roughly) starting in the late eighteenth century with Charles-Augustin de Coulomb's development of Coulomb's Law, which mathematically describes the force between two electrostatic charges, a lot of work was done by various scientists to understand and formalize the study of electrical and magnetic forces. This work was all pretty scattershot, however, and was not built into a single, shared framework that could be used in a general way: a theory of electromagnetism as we would understand it.

Starting in the 1850s, James Clerk Maxwell set out to unify the various experimental and theoretical results that had been found in this field, and by analogy with some mechanical models he successfully created a unified theory of electromagnetism, eventually published in his paper A Dynamical Theory of the Electromagnetic Field, with the set of equations describing the fields comprising its part III.

Those are not Maxwell's Equations! This was great, powerful work, and his formalism of twenty equations in twenty unknowns (directly reducible to eight equations by employing vectors) was a huge advance, and in particular unified light into the theory of electromagnetism.

What we know as Maxwell's Equations, however, is not this system of equations! The modern equations, the system of four equations in two unknowns (or four depending on how you think of the integration surfaces), was developed by Oliver Heaviside, who is easily one of the most underrespected engineers and scientists of his era. He was kind of a weird, irascible guy with some shitty ideas and even shittier behaviors, especially toward the end of his life (that can happen when your father is an abusive shithead even by the standards of Victorian England and you learn from that, I guess) but among other things:

  • Did this thing we're talking about
  • Developed the operational calculus, allowing some (in principle, any) linear differential equations to be solved as purely algebraic equations
  • Developed much of the modern vector calculus, the mathematical formalism he used to create the modern Maxwell's Equations and that underpinned a significant amount of early work on quantum mechanics
  • Developed the transmission line model and the set of telegraph equations still used daily for analyzing the behavior of electrical transmission lines
  • Invented the coaxial cable
  • Mathematically predicted the existence of the ionosphere and its effect on radio propagation
  • Published a full derivation of the magnetic component of the Lorentz force a few years before Hendrik Lorentz derived the full theory
  • Like a hundred other fucking things

So we're going to talk about the Maxwell-Heaviside Equations and if I ever see you disrespecting Oliver Heaviside's work I am coming to your house.

The equations are a mathematical formalism describing the entirety of classical electromagnetism; that means they are the classical limit of quantum electrodynamics, or what holds at physical scales above the quantum scale, ie ours. These are the foundation of electrical engineering; most electrical work and electronics is still based on these equations until you get down to the level of modern nanoscale field-effect transistors where quantum effects actually do come into play.

Each equation comes in two forms, an integral form and a differential form. This is usual calculus stuff, as integration and differentiation are inverse operations, or in other words:

$$f'(x) = \frac{d}{dx}f(x) \quad \text{(the derivative of } f(x)\text{)}$$

$$\int f'(x)\,dx = f(x) + C \quad \text{(integrating the derivative gives you } f(x) \text{ up to an additive constant)}$$

$$\frac{d}{dx}\int f(x)\,dx = f(x) \quad \text{(taking the derivative of the indefinite integral gives you } f(x)\text{)}$$

Okay cool we are like... not even on our way really. We're Muerte at the Waffle House. But okay.

The four equations are as follows. You are not expected to understand this yet:

| | Integral equations | Differential equations |
| --- | --- | --- |
| Gauss's law | $\oint_{\partial\Omega} \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\iiint_\Omega \rho\,dV$ | $\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}$ |
| Gauss's law for magnetism | $\oint_{\partial\Omega} \mathbf{B}\cdot d\mathbf{S} = 0$ | $\nabla\cdot\mathbf{B} = 0$ |
| Maxwell-Faraday equation (Faraday's law of induction) | $\oint_{\partial\Sigma} \mathbf{E}\cdot d\boldsymbol{\ell} = -\frac{d}{dt}\iint_\Sigma \mathbf{B}\cdot d\mathbf{S}$ | $\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}$ |
| Ampère's circuital law (with Maxwell's addition) | $\oint_{\partial\Sigma} \mathbf{B}\cdot d\boldsymbol{\ell} = \mu_0\left(\iint_\Sigma \mathbf{J}\cdot d\mathbf{S} + \varepsilon_0\frac{d}{dt}\iint_\Sigma \mathbf{E}\cdot d\mathbf{S}\right)$ | $\nabla\times\mathbf{B} = \mu_0\left(\mathbf{J} + \varepsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)$ |

My brother in Satan, what the fuck is that notation?

Okay! First off, we're stealing some conventions from wikipedia because they had some nicely laid-out LaTeX already and I'm lazy. So, as taken from the Wikipedia page on the Maxwell-Heaviside Goddamn Equations which is a sadly dogshit resource in terms of trying to learn what the hell they mean:

Symbols in bold represent vector quantities, and symbols in italics represent scalar quantities, unless otherwise indicated. The equations introduce the electric field, E, a vector field, and the magnetic field, B, a pseudovector field, each generally having a time and location dependence. The sources are

  • the total electric charge density (total charge per unit volume), ρ, and
  • the total electric current density (total current per unit area), J.

The universal constants appearing in the equations are:

  • the permittivity of free space, ε0, and
  • the permeability of free space, μ0.

Let's go through these one at a time.

Basic vector math. If you're already familiar with vector arithmetic, you can skip this.

This is going to be a short introduction to vector math and the operations required to discuss the equations themselves. It's just a crash course, so if you want to learn more about the subject, LibreTexts hosts an excellent OpenStax textbook on the subject of calculus which includes extensive material on vector mathematics.

A vector is a mathematical object that represents both a magnitude and a direction. In the simplest sense, you can think of a set of coordinates as a vector; for example, in the standard Cartesian coordinate plane, the point (3, 4) could represent a two-dimensional vector with length 5 (the magnitude) and angle (the direction) of ~0.927 radians.

Graph of the vector from 0,0 to 3,4

The length of any vector is calculated, based on Pythagoras, as the square root of the sum of the squares of its component values. For a vector v, we call this the magnitude or the modulus of v, or sometimes the norm, and it is represented as |v|:

$$\mathbf{v} = \langle v_1, v_2, v_3 \rangle \qquad |\mathbf{v}| = \sqrt{v_1^2 + v_2^2 + v_3^2}$$

For example, for our vector (3, 4) above, we would calculate the length like:

$$\mathbf{v} = \langle 3, 4 \rangle \qquad |\mathbf{v}| = \sqrt{3^2 + 4^2} = 5$$

The direction of a two-dimensional vector is expressible as an angle counterclockwise from the x axis, typically expressed in radians. This angle is the direction of the vector, and sometimes called the argument of the vector, especially when treating complex numbers as vectors. Given a vector with two components, x and y, the direction θ is given by the inverse tangent of y/x:

$$\mathbf{v} = \langle v_1, v_2 \rangle \qquad \theta = \arctan\frac{v_2}{v_1}$$

In three dimensions, the direction of a vector is a much more complicated question, and we don't usually try to treat it as an independent quantity; we only really use angles between vectors, and in three dimensions it's not necessarily easy to decide what the angle of a vector in its own right should be measured against. Instead, we deal with what are called unit vectors, a vector that points the same direction as the vector whose direction we are interested in but has a length of exactly 1. For a vector v, the unit vector is called v^, pronounced "v-hat", and no, I assure you, that's not a joke. What the fuck is going on with mathematicians.

We can obtain a unit vector by dividing a vector by its own length:

$$\mathbf{v} = \langle v_1, v_2, v_3 \rangle \qquad \hat{\mathbf{v}} = \left\langle \frac{v_1}{|\mathbf{v}|}, \frac{v_2}{|\mathbf{v}|}, \frac{v_3}{|\mathbf{v}|} \right\rangle$$
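If you want to poke at these numerically, here's a quick Python/NumPy sketch of the formulas above, using our (3, 4) example; nothing here is load-bearing, it's just the math run through a computer:

```python
import numpy as np

v = np.array([3.0, 4.0])

# Magnitude: the square root of the sum of the squared components
magnitude = np.linalg.norm(v)      # sqrt(3^2 + 4^2) = 5.0

# Direction, as radians counterclockwise from the x axis
# (arctan2 handles the quadrant bookkeeping that plain arctan doesn't)
theta = np.arctan2(v[1], v[0])     # ~0.927 rad

# Unit vector: the vector divided by its own length
v_hat = v / magnitude              # [0.6, 0.8], with length exactly 1

print(magnitude, theta, v_hat, np.linalg.norm(v_hat))
```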

Before we go any further, let's talk about notation a little bit. First, a common way to represent vector quantities is to write an arrow above the variable name, like F⃗; we'll use that style later for vector field functions. I'm only writing functions this way, but it is conventional to write vector variables this way too, especially in cases where boldface is not available or not easily distinguishable from regular text. Second, there are a number of equivalent ways to express a vector quantity. All of the following express precisely the same vector:

$$\mathbf{v} = \langle v_1, v_2, v_3 \rangle \qquad \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \qquad \mathbf{v} = v_1\hat{\mathbf{i}} + v_2\hat{\mathbf{j}} + v_3\hat{\mathbf{k}}$$

In the third style, $\hat{\mathbf{i}}$, $\hat{\mathbf{j}}$ and $\hat{\mathbf{k}}$ represent the standard basis vectors, unit vectors directly along the x, y and z axes, respectively:

$$\hat{\mathbf{i}} = \langle 1, 0, 0 \rangle \qquad \hat{\mathbf{j}} = \langle 0, 1, 0 \rangle \qquad \hat{\mathbf{k}} = \langle 0, 0, 1 \rangle$$

All of these are valid and have their uses; the columnar form is usually seen more in pure math contexts but is convenient when doing matrix math. We'll be seeing all three, so don't get confused.

Next, we're going to talk about operations you can do on vectors, starting with some basic arithmetic. First, multiplication is the main well-defined operation between a vector and a scalar. Given a vector v and a scalar x:

$$x\mathbf{v} = \langle xv_1, xv_2, xv_3 \rangle \qquad \frac{\mathbf{v}}{x} = \left\langle \frac{v_1}{x}, \frac{v_2}{x}, \frac{v_3}{x} \right\rangle$$

Simple stuff, you just multiply each component by x, or divide through in the same way (which is really just multiplying by 1/x). Note that adding a vector and a scalar, subtracting one from the other or dividing a scalar by a vector are all undefined because they're meaningless operations; a vector has a direction and a scalar doesn't, so what could those operations even mean? Multiplication, on the other hand, can be pretty clearly interpreted as scaling the length of the vector while leaving its direction unaltered (which should also make it very clear how you get a unit vector by taking v/|v|).

Now, vector-vector operations! Now we're getting into the fun stuff. First, addition and subtraction. Given two vectors, u and v:

$$\mathbf{u} + \mathbf{v} = \langle u_1+v_1,\ u_2+v_2,\ u_3+v_3 \rangle \qquad \mathbf{u} - \mathbf{v} = \langle u_1-v_1,\ u_2-v_2,\ u_3-v_3 \rangle$$

Again, just add or subtract the components, easy. The geometric interpretation of this is pretty interesting though. Let's say we have two 2d vectors, (3,4) and (1, -2), which give (4,2) when added:

Adding (3,4) and (1,-2) to get (4,2)

Adding or subtracting two vectors gives a third vector that points to the location that's the sum or difference of the two original vectors.

Now, multiplication. Here, uh... Leave your phone on the table and let's go out to the car for a minute.

...

Alright, I'll be straight: There's no "multiplication" operation between two vectors. In fact, there's two different operations you can do that are like multiplication, but neither of them is straight-up multiplication. For that, you want to be dealing in complex mathematics and you get that nice rotational behavior, but that's only for two-dimensional vectors. Here in general vector space land we have to make do with the dot product and the cross product.

First, the dot product. Given two vectors, u and v, the dot product can be determined as follows:

$$\mathbf{u}\cdot\mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3$$

It's the sum of the products of each term, and gives a scalar result. For instance, the dot product of (3,4) and (1,-2):

$$\langle 3, 4 \rangle \cdot \langle 1, -2 \rangle = 3\cdot 1 + 4\cdot(-2) = 3 - 8 = -5$$

The dot product also has an equivalent trigonometric definition:

$$\mathbf{u}\cdot\mathbf{v} = |\mathbf{u}||\mathbf{v}|\cos\theta$$

In practice, what this means is that the dot product encodes the angle θ between two vectors, which can be obtained as such:

$$\theta = \arccos\frac{\mathbf{u}\cdot\mathbf{v}}{|\mathbf{u}||\mathbf{v}|}$$
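Here's the same arithmetic as a quick NumPy sketch, including recovering the angle between our two example vectors:

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, -2.0])

# Dot product: the sum of the products of each component
dot = np.dot(u, v)    # 3*1 + 4*(-2) = -5.0

# Recovering the angle between the vectors from the trig definition
theta = np.arccos(dot / (np.linalg.norm(u) * np.linalg.norm(v)))
print(dot, theta)     # -5.0, ~2.03 radians
```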

Now, we have the cross product, also called the vector product. The cross product of two vectors u and v, denoted by u×v, encodes the angle between the two vectors and their lengths, much like the dot product, but also a unit vector perpendicular to the two original vectors, called the normal vector. If you're familiar with 3D graphics, you'll know about the use of normal vectors for calculating incidence of light, texture mapping and myriad other applications, and here it has a number of similar applications.

There's a few different ways to calculate the cross product, depending on whether you're using formal linear algebra or not, but a simple method is:

$$\mathbf{u}\times\mathbf{v} = \begin{pmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{pmatrix}$$

For instance, the cross product between the vectors (3,4,5) and (1,-2,0) (and please note that the cross product is specific to three-dimensional vectors; there is a related concept called the exterior product for other types of vectors, and there's also a seven-dimensional cross product related to the octonions, but that's well beyond what we're interested in here):

$$\begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix} \times \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} = \begin{pmatrix} 4\cdot 0 - 5\cdot(-2) \\ 5\cdot 1 - 3\cdot 0 \\ 3\cdot(-2) - 4\cdot 1 \end{pmatrix} = \begin{pmatrix} 10 \\ 5 \\ -10 \end{pmatrix}$$

Similarly to the dot product, the cross product has a trigonometric definition:

$$\mathbf{u}\times\mathbf{v} = |\mathbf{u}||\mathbf{v}|\sin(\theta)\,\hat{\mathbf{n}}$$

Where θ is the angle between u and v, just as in the dot product, and n^ is the normal vector described above. Note that the direction of the normal vector follows what is called the right-hand rule; if you point your index finger along the direction of the first vector, u, then turn your hand so that you can point your middle finger along the direction of the vector v, your thumb will point in the direction of n^. Seen here (image taken from Wikipedia's article on the right-hand rule, which explains it in more depth):

Diagram of the right-hand rule
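And a quick NumPy sketch of the same cross product, checking both the perpendicularity and the trig definition:

```python
import numpy as np

u = np.array([3.0, 4.0, 5.0])
v = np.array([1.0, -2.0, 0.0])

# Component formula from above
w = np.cross(u, v)
print(w)    # [ 10.   5. -10.]

# The result is perpendicular to both inputs, so both dot products vanish
print(np.dot(w, u), np.dot(w, v))    # 0.0 0.0

# And its length matches the trig definition |u||v|sin(theta)
theta = np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
print(np.linalg.norm(w), np.linalg.norm(u) * np.linalg.norm(v) * np.sin(theta))
```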

And that's our basic vector operations. All of our other vector operations, in particular the vector calculus we'll be talking about later, build off these operations.

Basic calculus. If you know the basics of differentiation, integration and multivariate calculus, you can skip this too.

Again, this is just going to be a quick crash course. LibreTexts hosts an excellent OpenStax textbook on calculus. MIT also has excellent OpenCourseware courses on single-variable calculus and [multivariate and vector calculus](https://ocw.mit.edu/courses/18-02-multivariable-calculus-fall-2007/).

Okay: Calculus is the study of change. How the behavior of functions changes across their domain, how functions change as they approach their limits, how series of sums or products change as they approach infinite numbers of terms, and the like. Physics uses calculus heavily because almost all physics is interested, in some way or another, in change over time.

At the bottom of calculus we have this idea of a limit. Instead of necessarily being concerned with the result of a function at a particular point, f(x) where x is some value, the limit of a function is how it behaves as it approaches some value. As a basic example, let's think about the classic problem of division by zero. If you have a function f(x) = 1/x, it is pretty well-known that you can't just plug a zero in for x and get anything back, since division by zero is undefined and, in fact, is a logical dead end, since it introduces a logical contradiction; if you've ever seen bogus "proofs" that purport to show something like 2=1, the usual way that's accomplished is by doing a hidden division by zero somewhere, introducing a contradiction which allows any statement to be "proved". However! At any non-zero value of x, even those very close to zero, this function has a defined value! The way we use a limit, then, is to approach a particular value infinitely closely and see what value the function approaches as x closes in on it.

So let's use that to look at our function there. First, let's take a graph of f(x):

We can see from this that when approached from the left (where x is < 0), the function goes down, seemingly to -∞, while when approached from the right (x > 0), it goes up:

$$f(x) = \frac{1}{x}$$

$$\lim_{x\to 0^-} f(x) = -\infty \quad \text{(left-handed limit)} \qquad \lim_{x\to 0^+} f(x) = +\infty \quad \text{(right-handed limit)}$$

What we have here is a function that has a vertical asymptote at x=0, meaning the function diverges (goes to infinity). It has a left-handed limit (where we "approach" the limit from the left so our variable is always strictly less than the limit we want to find) and a right-handed limit (approaching from the right so that the variable is always strictly greater than the limit we're investigating), but they're different so the two-sided limit (where we would be approaching the value from both sides) doesn't exist.

There's a number of different techniques for determining limits, but we're not going to go too much into it here; it's a big subject and finding really difficult limits can employ some fairly advanced math. In most of the cases we'll be caring about, we'll be employing fairly simple algebra or some basic calculus that can help us; the main thing to know is that if a function is defined at the limit value (ie, you can plug in the limit and get a result out of the function) then the limit is identical to the value of the function at that point.
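You can watch this one-sided behavior numerically; here's a tiny Python sketch (the sample points are arbitrary picks of mine):

```python
# Watching f(x) = 1/x near zero from both sides
f = lambda x: 1 / x

for h in (0.1, 0.01, 0.001, 0.0001):
    print(f"f({-h}) = {f(-h):14.1f}    f({h}) = {f(h):14.1f}")

# The left-handed values run off toward -infinity and the right-handed values
# toward +infinity, so the two-sided limit at 0 doesn't exist.
```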

Next, we're going to talk about the derivative. Let's say we have a big truck that we know goes from 0 to 100 kph in seven seconds. What if we wanted to know what its acceleration was (assuming constant acceleration, ie it doesn't change acceleration)?

Speed vs time

So the acceleration at any given point is equal to the slope of the graph at that point. For this case, the answer is obvious because we already have it: it's the slope of the line, which is 100/7, meaning the constant acceleration is (100 km/h)/(7 s) ≈ 3.97 m/s².

But what if we wanted the slope of a function at some point x for a non-linear function? For instance, a basic parabola, f(x) = x². What's the slope of the graph at, say, x = 3?

Well, let's think about our linear acceleration again. Because the function f(x) = (100/7)x is linear, the acceleration, the rate of change of f(x), is just constant. f(x) = x², however, is not linear; it changes quadratically, faster and faster, meaning that the rate of change is not constant but is going to be a function in its own right.

That rate of change function is the derivative. For a single-variable function f(x), the derivative of that function is a function that expresses the slope of a tangent line at any given point on the curve. There's a few different kinds of notation for the derivative - notation for the operation of differentiation - but mostly we're going to stick with Leibniz's notation, $\frac{d}{dx}f(x)$, or Lagrange's notation, $f'(x)$.

$$f(x) = x^2 \qquad f'(x) = 2x$$

Now, explaining how to do differentiation is a massive topic in its own right, but at its base it's defined as a pretty simple limit formula:

$$f'(x) = \lim_{h\to 0}\frac{f(x+h) - f(x)}{h}$$

What this is saying is that we start with the difference between two values of the function separated by some distance h, $f(x+h) - f(x)$, and divide that by the distance on the x-axis between those two points, h, creating a function that returns the slope of a line that runs between those two points on the curve. We then take the limit as h, the distance between the two points, approaches zero to get a function that returns the slope of a tangent line to f(x) at any infinitesimal point x. For instance, here's how we would differentiate our function f(x) = x² using the limit definition:

$$f(x) = x^2$$

$$f'(x) = \lim_{h\to 0}\frac{(x+h)^2 - x^2}{h}$$

$$= \lim_{h\to 0}\frac{x^2 + 2hx + h^2 - x^2}{h} \quad \text{(expanding } (x+h)^2\text{)}$$

$$= \lim_{h\to 0}\frac{h(2x+h)}{h} \quad \text{(factor an } h \text{ out of the numerator)}$$

$$= \lim_{h\to 0}\,(2x + h) \quad \text{(this is continuous at } h=0 \text{ so we can just plug it in!)}$$

$$= 2x$$
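You can also watch the difference quotient converge numerically; a quick Python sketch with my own sample values of h:

```python
# Difference quotient (f(x+h) - f(x)) / h for f(x) = x^2 at x = 3,
# shrinking h toward zero; it should close in on f'(3) = 2*3 = 6
f = lambda x: x**2
x = 3.0

for h in (1.0, 0.1, 0.01, 0.0001):
    print(h, (f(x + h) - f(x)) / h)    # 7.0, 6.1, 6.01, 6.0001...
```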

Calculating derivatives gets tricky fast and using the limit definition obviously gets incredibly unwieldy, so lots of rules and definitions have been discovered over the last few centuries to make getting derivatives easier. It's a topic worth entire textbooks in its own right so I'm going to refer you to Paul's Online Notes at Lamar University on derivatives. We'll be using derivatives throughout but as long as you understand the concepts you shouldn't need to know much of the specifics of differential calculus to follow along. That being said, here's a quick table of some derivative rules that will help you follow along with my blithering:

| Description | Formula | Example |
| --- | --- | --- |
| Power Rule: the derivative of $x^k$ in terms of $x$ is $kx^{k-1}$. We derived a limited version of it just above! | $\frac{d}{dx}x^k = kx^{k-1}$ | $\frac{d}{dx}x^4 = 4x^3$ |
| Sine and cosine: the derivative of $\sin x$ is $\cos x$, and the derivative of $\cos x$ is $-\sin x$ | $\frac{d}{dx}\sin x = \cos x$, $\frac{d}{dx}\cos x = -\sin x$ | |
| Logarithms: the derivative of $\log x$ (the natural logarithm) is $1/x$; the derivative of $\log_a x$ is $1/(x \log a)$ | $\frac{d}{dx}\log x = \frac{1}{x}$, $\frac{d}{dx}\log_a x = \frac{1}{x\log a}$ | |
| Constant Multiple Rule: the derivative of a function times a constant is the same as the derivative of the original function with the constant multiplied in after | $(Cf)'(x) = Cf'(x)$ | $\frac{d}{dx}2x^2 = 2\frac{d}{dx}x^2 = 2(2x) = 4x$ |
| Derivative of multiple terms: the derivative of a function of multiple terms is equal to the sum of the derivatives of each individual term | $(f+g)'(x) = f'(x) + g'(x)$ | $\frac{d}{dx}(x^2 + 2x) = \frac{d}{dx}x^2 + \frac{d}{dx}2x = 2x + 2$ |
| Chain Rule: how to differentiate a function nested inside another function | $(f \circ g)'(x) = f'(g(x))\,g'(x)$ | with $f(x) = x^2$, $g(x) = \sin x$: $\frac{d}{dx}(\sin x)^2 = 2\sin x\cos x$ |
| Product Rule: how to take the derivative of the product of two functions | $(fg)'(x) = f'(x)g(x) + g'(x)f(x)$ | $\frac{d}{dx}\,x^2\sin x = 2x\sin x + x^2\cos x$ |
| Quotient Rule: how to differentiate a function divided by another function | $\left(\frac{f}{g}\right)'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}$ | $\frac{d}{dx}\,\frac{x^2}{\sin x} = \frac{2x\sin x - x^2\cos x}{\sin^2 x}$ |
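If you'd rather have a computer check these rules, SymPy can differentiate symbolically; here's a quick sketch spot-checking a few rows of the table (assuming you have sympy installed):

```python
import sympy as sp

x = sp.symbols('x')

print(sp.diff(x**4, x))                # 4*x**3 (power rule)
print(sp.diff(2*x**2, x))              # 4*x (constant multiple rule)
print(sp.diff(sp.sin(x)**2, x))        # 2*sin(x)*cos(x) (chain rule)
print(sp.diff(x**2 * sp.sin(x), x))    # x**2*cos(x) + 2*x*sin(x) (product rule)
print(sp.diff(x**2 / sp.sin(x), x))    # quotient rule, in sympy's preferred form
```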

Now, we've been talking about functions of a single variable so far, f(x). When you take the derivative of a single-variable function, you're always differentiating with respect to the only variable. What about functions with multiple variables? For instance, since we're dealing in physical phenomena here, magnetism and electricity, we might have functions that deal with points in space, so instead of a single variable, or one variable whose value is strictly a function of another one, we might have two or three variables that we're dealing with in a single function. How do we handle differentiation in that case?

Well, there's a few different approaches depending on what you're trying to accomplish. In many cases you'll only care about the derivative in terms of a single variable in the function, in which case you're taking the partial derivative, which uses the symbol ∂. The partial derivative in terms of a single variable in a multivariate function allows you to isolate the rate of change of that variable alone. In turn, if you want to get the derivative of the entire function in all variables, you're getting what's called the total derivative.

The way a partial derivative works is by treating all variables except the one we're differentiating in terms of as constants. Intuitively, this makes a certain amount of sense; if we isolate one variable and only allow it to vary, we can see the behavior of that variable alone.

As an example, let's say we have a function in two variables, f(x,y) = y³ + x²y − xy. First, let's graph it to see what we're up against:

(I put in the color map just for looks; the colors just equal the z value, so don't break your brain trying to decipher it. There's no ARG here... OR IS THERE?!)

It's a nice-looking curved surface. Now, let's take the partial derivatives in terms of x and y and see what we get:

$$f(x,y) = y^3 + x^2 y - xy$$

$$\frac{\partial}{\partial x}f(x,y) = y(2x-1) \qquad \frac{\partial}{\partial y}f(x,y) = 3y^2 + x^2 - x$$

Graph of the partial derivative in x, y(2x − 1) (left), and in y, 3y² + x² − x (right)

On the left we have the graph of the partial derivative in terms of x, and on the right in terms of y. You can see how the surface has been partially deformed in different ways in both cases but still is recognizable as a transformation of the original surface.

Taking the total derivative is pretty easy in most cases that we will care about. Essentially, if certain conditions are met (conditions we will not go into because it gets real complicated real quickly and it's nothing we're going to need to deal with here), the total derivative is simply taking each partial derivative in turn, one after another. Really interestingly, it doesn't matter what order you take them in either (this order-independence is known as Clairaut's theorem):

$$f(x,y) = y^3 + x^2 y - xy$$

$$\frac{\partial}{\partial x}f(x,y) = y(2x-1) \qquad \frac{\partial}{\partial y}\frac{\partial}{\partial x}f(x,y) = 2x - 1$$

$$\frac{\partial}{\partial y}f(x,y) = 3y^2 + x^2 - x \qquad \frac{\partial}{\partial x}\frac{\partial}{\partial y}f(x,y) = 2x - 1$$

$$df = 2x - 1$$

And a graph of the total derivative, df = 2x - 1:

The total derivative, df = 2x - 1

If you compare this with the original graph, you can see that it does give a tangent plane to the original function at any (x,y)!
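SymPy will also happily take the partials for you; a quick sketch double-checking both partial derivatives and the order-independence (again assuming sympy):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = y**3 + x**2*y - x*y

print(sp.factor(sp.diff(f, x)))    # y*(2*x - 1)
print(sp.diff(f, y))               # 3*y**2 + x**2 - x

# Mixed partials come out the same no matter the order (Clairaut's theorem)
print(sp.diff(f, x, y))            # 2*x - 1
print(sp.diff(f, y, x))            # 2*x - 1
```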

Since this is all groundwork for the Maxwell-Heaviside Equations, we'll mainly be dealing with vector functions, which are effectively multivariate functions, so we'll be dealing with partial derivatives a lot; they're the building blocks of the differential operators we'll meet later. Total derivatives of this type, we won't be dealing with so much.

Now, integration, aka "the hard part". There's a lot of ways to talk about integration, but let's start with the little fact we already talked about right up top: Integration is, in some senses, an inverse operation of differentiation. But what does that mean exactly? We have this interpretation of the derivative of a function as being a function describing the rate of change of the original function. By a similar token, the integral of a function is the area under the function, either over the entire domain of the function (an indefinite integral) or over a specific region (the definite integral).

The last paragraph is a really blockheaded version of the fundamental theorem of calculus, which relates differentiation and the antiderivative (what we call the indefinite integral) to definite integration. You don't need to worry about the details of that; everything important about it, I just described and now you have it too.

Anyway, an example! We already know that for f(x) = x², the derivative is f'(x) = 2x. If you look at it in reverse, starting from f(x) = 2x, then the antiderivative would be:

$$f(x) = 2x \qquad \int f(x)\,dx = x^2 + C$$

Quick explanation of the notation: The big stretched out S is a big stretched out S and it's the integral sign; it's a Declaration of Independence-ass S because conceptually integration is kind of a fucked-up sum of an infinite number of terms. The "dx" tells us that we're integrating in terms of x, and in particular it's an infinitesimal delta which I will explain in a minute. Finally, the + C is called the constant of integration, and it's there because a math teacher will stab you to death in front of the whole class if you forget it; it's actually important because the derivative of any constant term is zero, meaning that 2x is the derivative of x2 + anything, and so we express that "anything" in the indefinite integral as the arbitrary constant C.

As mentioned, integration can be thought of as a fucked-up infinite sum. Think of it this way: The area under any curve could be approximated by a bunch of rectangles of equal width, whose height is the value of the function at the left (or right, or middle or whatever) side of the rectangle. Something like this:

Riemann sum of x^2

With narrow rectangles like in the graph, we have a pretty good approximation already, but the more rectangles we get, the better approximation we get. As the number of rectangles goes to infinity, the rectangles become infinitesimally thin and the sum of the areas of the infinite number of boxes is exactly equal to the area under the graph. If we say we have k rectangles, and we want the area under a function f(x) from x values a to b, we can describe our approximation as a sum of the areas of those rectangles:

$$\Delta x = \frac{b - a}{k} \qquad A = \sum_{n=0}^{k-1} f(a + n\Delta x)\,\Delta x$$

This is usually expressed in a slightly different form (a bit more general and a bit less of a sloppy fucking for loop) but broadly speaking this is called a Riemann sum. If we take the limit of A as k goes to infinity, we get a rough limit definition of the Riemann integral. That Δx becomes the little infinitesimal dx in our integral.
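Here's a tiny Python sketch of exactly that left-endpoint Riemann sum, watching it converge as k grows (we'll get the exact answer in a minute):

```python
# Left-endpoint Riemann sum for f(x) = x^2 on [0, 5]; more rectangles,
# better approximation, converging on the exact area of 125/3 = 41.666...
f = lambda x: x**2
a, b = 0.0, 5.0

for k in (10, 100, 1000, 100000):
    dx = (b - a) / k
    area = sum(f(a + n * dx) * dx for n in range(k))
    print(k, area)
```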

Now, much like differentiation, yes, we have a limit definition of integrals (although, in keeping with integration being a much bigger pain in the ass than differentiation, it's quite a bit more complex than the difference quotient and Riemann sums aren't the only way of defining integrals either), but we don't want to try to use it all the time to evaluate integrals. In particular, taking antiderivatives with the limit definition is ugly as hell. So! We have, instead, a whole course worth of integration rules, the most basic ones I'm going to address here, just as I did with the differentiation rules.

| Description | Formula | Example |
| --- | --- | --- |
| Power Rule: the indefinite integral of $x^k$ in terms of $x$ is $\frac{x^{k+1}}{k+1}$ (for $k \neq -1$; see Reciprocals) | $\int x^k\,dx = \frac{x^{k+1}}{k+1} + C$ | $\int x^3\,dx = \frac{x^4}{4} + C$ |
| Sine and cosine: inverses of the derivatives | $\int \sin x\,dx = -\cos x + C$, $\int \cos x\,dx = \sin x + C$ | |
| Logarithms: the integral of $\log x$ is $x\log x - x$, divided by the log of the base for bases other than $e$ | $\int \log x\,dx = x\log x - x + C$, $\int \log_a x\,dx = \frac{x\log x - x}{\log a} + C$ | |
| Reciprocals: the integral of $1/x$ is $\log\lvert x\rvert$ | $\int \frac{1}{x}\,dx = \log\lvert x\rvert + C$ | |
| Exponentials: the exponential function is its own integral (and derivative); for bases other than $e$, divide by the logarithm of the base | $\int e^x\,dx = e^x + C$, $\int k^x\,dx = \frac{k^x}{\log k} + C$ | $\int 7^x\,dx = \frac{7^x}{\log 7} + C$ |
| Constant multiple rule: the integral of a function times a constant is the same as the integral of the original function with the constant multiplied in after | $\int Cf(x)\,dx = C\int f(x)\,dx$ | $\int 2x^3\,dx = 2\int x^3\,dx = 2\cdot\frac{x^4}{4} = \frac{x^4}{2} + C$ |
| Integrating multiple terms: the integral of the sum of several terms is equal to the sum of the integrals of each term | $\int f(x)+g(x)\,dx = \int f(x)\,dx + \int g(x)\,dx$ | $\int x^2+2x\,dx = \int x^2\,dx + \int 2x\,dx = \frac{x^3}{3} + x^2 + C$ |
| Integration by parts: less a rule and more of a process; to integrate two functions multiplied together you can use something of a reversed product rule | $\int f(x)g(x)\,dx = f(x)\!\int g(x)\,dx - \int\!\left(f'(x)\int g(x)\,dx\right)dx$ | $\int x^2\sin x\,dx = x^2(-\cos x) + \int 2x\cos x\,dx = 2x\sin x + 2\cos x - x^2\cos x + C$ |
| Integration by substitution: another technique for integrating multiple functions; you can often manipulate an integrand into the right form to substitute a single variable for a more complex expression | $\int f(g(x))\,g'(x)\,dx = \int f(u)\,du$ | with $u = x^2$, $du = 2x\,dx$: $\int \sin(x^2)\,2x\,dx = \int \sin u\,du = -\cos u + C = -\cos x^2 + C$ |

As noted, integrals come in two main flavors, definite integrals and indefinite integrals. The definite integral gives the exact area under a curve for a particular set of endpoints (formally a closed interval). To get the definite integral, there's a few different approaches but the simplest way (assuming the integral is amenable to it and you don't have to use horrible things like contour integration that we're not going to talk about here) is to take the indefinite integral and evaluate it at the endpoints (all that means is you plug in the upper and lower endpoints and subtract the term with the lower endpoint from the term with the upper endpoint). For instance, for our graph of x2 from 0 to 5 up above:

$$\int_0^5 x^2\,dx = \left.\frac{x^3}{3}\right|_0^5 = \frac{5^3}{3} - \frac{0^3}{3} = \frac{125}{3}$$

Note that we didn't bother with the constant of integration because it always disappears in a definite integral; you can see when we evaluate it, we'd just end up with C - C so there's no point.

Now, we've been talking about single-variable integration, and now it's time to talk about integration in multiple variables. This lets us think about integrating higher-dimensional volumes instead of just 2-d areas.

The first thing we need to know is that there's no such thing as an antiderivative in multiple variables, so we generally don't think about indefinite integrals for multiple integrals. Instead, we extend the definite integral to integrating over a domain, and we notate that with multiple integral signs. For instance, for a domain D:

$$\mathbf{D} \subseteq \mathbb{R}^2 \text{ (domain is a subregion of } \mathbb{R}^2\text{)}: \quad I = \iint_D f(x,y)\,dx\,dy \quad \text{(the double integral of } f \text{ on } D\text{)}$$

$$\mathbf{D} \subseteq \mathbb{R}^3 \text{ (domain is a subregion of } \mathbb{R}^3\text{)}: \quad I = \iiint_D f(x,y,z)\,dx\,dy\,dz \quad \text{(the triple integral of } f \text{ on } D\text{)}$$

Evaluating multiple integrals is a large topic but we're going to restrict ourselves to the simple case; for most continuous functions over a closed region and other math words we can evaluate these types of multiple integrals as iterated integrals, much like what we did for the derivative. For instance:

$$f(x,y) = x^2 + y^2 \qquad \mathbf{D} = \{(x,y) \in \mathbb{R}^2 : -1 \le x \le 1;\ -1 \le y \le 1\} \quad \text{(x is over -1 to 1, y is over the same)}$$

$$I = \iint_D f(x,y)\,dx\,dy = \int_{-1}^{1}\int_{-1}^{1} x^2 + y^2\,dx\,dy$$

$$= \int_{-1}^{1} \left. \frac{x^3}{3} + xy^2 \right|_{x=-1}^{1} dy \quad \text{(integrate and evaluate over x…)}$$

$$= \int_{-1}^{1} \frac{2}{3} + 2y^2\,dy = \left. \frac{2y}{3} + \frac{2y^3}{3} \right|_{y=-1}^{1} \quad \text{(then over y!)}$$

$$= \frac{8}{3}$$

Triple integrals work the same way. Additionally, according to Fubini's theorem, for multiple integrals that are continuous in their domains (and so can be evaluated at all), the order of evaluation of the variables doesn't matter, and you'll always get the same result.
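If you want to check that 8/3 numerically, SciPy's dblquad will do the iterated integration for you; a quick sketch (assuming scipy is installed):

```python
from scipy.integrate import dblquad

# Same double integral as above: f(x,y) = x^2 + y^2 over [-1,1] x [-1,1].
# Note dblquad's integrand takes its arguments as (y, x).
result, abserr = dblquad(lambda y, x: x**2 + y**2, -1, 1, -1, 1)
print(result)    # 2.666... = 8/3
```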

Now, there's a lot of complexity we haven't addressed here; for instance, if the domain of the integral is more complicated than a simple rectangle/rectangular prism so that it's defined as a function in its own right, we have more work to do to figure out the endpoints - the bounds of integration - of each iterated integral. We will also run into (you can see it up above) notation like this:

$$\oiint_{\partial\Omega} \mathbf{B}\cdot d\mathbf{S} = 0$$

That circle describes this as a surface integral with some special rules on determining the bounds of integration (or, equivalently, transforming the integrand to be able to use linear bounds). We also saw single integral symbols with a circle; that describes an analogous line integral. You don't need to worry too much about these; I'll describe what's going on when we actually deal with them, since there's a limit to how much abstract-seeming bullshit I want to frontload here. Just know that at the end of the day, after we do some minor fuckery these will reduce to exactly the kinds of multiple integrals we've been looking at already and will make sense in context.

Permittivity and permeability

Loosely, permittivity describes the effects of a material on an electric field passing through it. More specifically, it describes the electric polarizability of a material; the higher the permittivity, the more of the electric field it can "store" or "absorb". It's measured in farads/meter, the farad being the SI unit of capacitance. The vacuum permittivity or permittivity of free space, ε0, is an experimentally-derived universal constant describing the permittivity of a classical vacuum; the official name of this is the "electric constant", but that's rather non-descriptive. The official value (more specifically, the CODATA 2018 recommended value) is ε0 ≈ 8.854188×10⁻¹² F/m.

Permeability is similar to permittivity, but for magnetic fields. When any material is exposed to a magnetic field, that material will be magnetized by that field. The strength of that magnetization is the permeability of that material, which is measured in henries/meter, where the henry is the SI unit of measure of magnetic inductance; one henry/meter is exactly equal to one newton per ampere squared (N/A²). Like permittivity, there is an experimentally-derived physical constant for the permeability of free space, the magnetic permeability of a classical vacuum, the magnetic constant μ0, whose CODATA 2018 recommended value is μ0 ≈ 1.25663706212×10⁻⁶ H/m.
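Both constants ship with SciPy if you want the precise values handy; a tiny sketch (assuming scipy):

```python
from scipy.constants import epsilon_0, mu_0

print(epsilon_0)    # 8.8541878128e-12 (F/m)
print(mu_0)         # 1.25663706212e-06 (H/m)
```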

Vector fields and pseudovector fields

A vector field F in ℝ3 is an assignment of a three-dimensional vector F(x,y,z) to each point (x,y,z) of a subset D of ℝ3. The subset D is the domain of the vector field. Alright, so now that we're clear on thHEY HEY STOP THROWING THINGS

Alright, let's try that again: A three-dimensional vector field is basically a function that maps every <x,y,z> point in some three-dimensional domain to another three-dimensional vector.

There's a couple equivalent ways to specify this; we're going to use F notation as the name of the vector function. It can be expressed as a vector whose components are three-variable functions:

$$\vec{F}(x,y,z) = \langle P(x,y,z),\ Q(x,y,z),\ R(x,y,z) \rangle$$

Or with the scalar functions as coefficients to the standard basis vectors:

$$\vec{F}(x,y,z) = P(x,y,z)\,\hat{\mathbf{i}} + Q(x,y,z)\,\hat{\mathbf{j}} + R(x,y,z)\,\hat{\mathbf{k}}$$

Vector fields can have any number of dimensions; for the purposes of the Maxwell-Heaviside equations, we'll mostly be talking about three-dimensional vectors, but for introductory purposes we'll look at some 2-D vector fields as well. As a simple example, consider the vector field F(x,y) = ⟨−y, x⟩. Mathematically, this means that any point (x,y) in the domain of the vector field is mapped to the vector ⟨−y, x⟩. Now, what does this mean in plain English? Well, let's think about what happens if we put different values in here. If we put in (0,0), we get the zero vector back, so no change. The further we get from the origin along the x axis, the larger the output's y component will be, and similarly the further we get from the origin along the y axis, the larger the output's x component will be, with its sign inverted. The point (1,0) will be mapped to (0,1), a vector pointing "up" one unit. Similarly, the point (0,1) will be mapped to (−1,0), a vector pointing directly left one unit. Let's graph a few points and see what this looks like graphically:

Graph of the described vector field, showing counterclockwise rotation that gets stronger the further the point is from the origin

This vector field describes counterclockwise rotation that gets stronger the further a point is from the origin. This could be used to model a vortex such as a tornado or a whirlpool, or the angular velocity of a ball on the end of a string as it's swung around at a constant speed. And that's it! That's all a vector field is. It should be easy to imagine how this could be used to mathematically represent a magnetic field (think about any graphical representation of a magnetic field you may have seen) or any other force or set of forces acting over a region.
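If you want to reproduce a graph like this yourself, here's a minimal Matplotlib sketch (the grid size and plot ranges are arbitrary choices of mine):

```python
import numpy as np
import matplotlib.pyplot as plt

# Sample F(x, y) = <-y, x> on a grid and draw each sampled vector as an arrow
x, y = np.meshgrid(np.linspace(-2, 2, 15), np.linspace(-2, 2, 15))
u, v = -y, x

plt.quiver(x, y, u, v)
plt.gca().set_aspect('equal')
plt.title('F(x, y) = <-y, x>')
plt.show()
```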

We also talk about pseudovectors and pseudovector fields; a pseudovector is a vector that, unlike "regular" polar vectors, keeps its direction when the coordinate system is inverted; equivalently, it picks up an extra sign flip under reflection compared to a polar vector. These show up in physics a lot, there's nothing particularly weird about them, and generally you don't have to worry too much about this as long as you understand the physical systems in question and why the math works the way it does, which I'll do my best to make as confusing as possible.

E: The electric field

E

To understand this we need a basic understanding of electrical force. The basics of this are pretty well known! Opposite charges attract, like charges repel. Two charges that are closer to each other attract or repel more strongly, and similarly stronger charges at the same distance will exert more force. The scalar force of two charges acting on each other in free space is given by Coulomb's Law:

$$F = \frac{q_1 q_2}{4\pi\varepsilon_0 r^2}$$

Where q1 and q2 are the magnitudes of the charges in coulombs, r is the distance between the two charges in meters (we're using metric everywhere here! None of this Yankee nonsense! I'm not measuring shit in rods and bushels and I'm not using an outhouse either!), and ε0 is our vacuum permittivity constant as above.

Now, the electric field strength, which is what we're really interested in here, is defined as the force exerted on a charge of +1 Coulomb at some distance r from the point charge of strength q:

$$E = \frac{q}{4\pi\varepsilon_0 r^2}$$

That's the scalar field strength at a single point. For the purposes of the equations we're looking at here, however, we generally want the vector field strength; we don't just want the strength of the electrical field, we want the direction of the electrical force as well. We can reformulate this in a vector form, with r being the vector expressing the point at which we want to find the electrical force, rather than just the scalar distance from the point q; |r| is the modulus of the vector r, or, more plainly, its length, which is equivalent to r in the scalar formula:

$$\mathbf{E} = \frac{q}{4\pi\varepsilon_0}\cdot\frac{\mathbf{r}}{|\mathbf{r}|^3}$$

To clarify what we did here, we just multiplied the scalar formula by r/|r|, the vector divided by its own length, which gives us a unit vector for the direction of the vector r, and changes the formula from a scalar result to a vector result.

This is a perfectly cromulent formulation of a simple vector field for the strength of a point charge. In fact, later on we will see something like this when we talk about Gauss's law. Importantly, this is a good demonstration of how we can introduce a vector - introducing direction - into a scalar formula, which will be extremely important later on.
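Here's a quick Python sketch of that vector formula; the function name and the example charge are made up for illustration, not anything standard:

```python
import numpy as np
from scipy.constants import epsilon_0

def e_field(q, r):
    """Vector E field of a point charge q (coulombs) sitting at the origin,
    evaluated at field point r (meters), per the formula above."""
    r = np.asarray(r, dtype=float)
    return q / (4 * np.pi * epsilon_0) * r / np.linalg.norm(r)**3

# Made-up example: a 1 nC charge, field point 1 m away along the x axis
print(e_field(1e-9, [1.0, 0.0, 0.0]))    # ~[8.99, 0, 0] V/m, pointing away
```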

B: The magnetic field

B is the magnetic flux density.

K.

Okay, okay, fine. Some materials have an inherent magnetic field; think of a bar magnet. The properties of such a magnetic field are very much like those of an electric field; opposite charges attract, like repels like and so on. From what we've seen so far, it should make sense that this is evidence of a vector field: the magnetic field exerts a force that has both a strength and a direction. We see the exact same behavior with an electromagnet, so we can intuit that these are the same basic effect (and probably have a lot to do with electrical force, too). Based on this, we can say that a permanent magnet or an electrical current exert a directional force on other permanent magnets or electrical currents, and that force is the magnetic field.

Now, having intuited that the magnetic field can be described as a vector field, the next thing is to understand how that directional force actually applies. For the electric field we have Coulomb's law, which in its vector form describes the force and direction exerted by an electrical charge on another charge. For magnetism, we have the somewhat analogous Lorentz force law (in fact, they are not only analogous, but as we will see later are intimately related and form two halves of a whole description of electromagnetic force). The Lorentz force equation is:

$$\mathbf{F} = q\mathbf{v}\times\mathbf{B}$$

Where F is the vector force exerted on a charge-bearing particle with strength q and velocity and direction v by the magnetic field B. The × indicates that we take the cross product of qv and B, which as discussed above is one of a few different ways of "multiplying" vectors together. We went more in depth above but as a refresher if you skipped it, just know that the cross-product of two vectors, u×v, is a vector w that is perpendicular to the two original vectors.

There's a few interesting elements here. First, the magnetic force exerted depends on the motion of the charge (described by v). That means that a notional entirely stationary charge has no magnetic force exerted on it! Additionally, as noted above, the force exerted is perpendicular to both the magnetic field and the particle's original direction of movement! These are both intuitively extremely weird and have no explanation in classical physics, but are instead emergent properties of quantum electrodynamics.
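A quick NumPy sketch of the Lorentz force formula; the helper name and example numbers are made up for illustration:

```python
import numpy as np

def lorentz_magnetic_force(q, v, B):
    """F = q v x B, the magnetic force on a charge q moving at velocity v."""
    return q * np.cross(v, B)

# Made-up numbers: a 1 C charge moving along +x through a field along +z
print(lorentz_magnetic_force(1.0, [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))
# [ 0. -1.  0.]: perpendicular to both v and B

# A stationary charge feels no magnetic force at all
print(lorentz_magnetic_force(1.0, [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]))
# [0. 0. 0.]
```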

So why is B the "magnetic flux density"? This is just a conventional name that has to do with how B is used in engineering. A "flux" is the surface integral of a vector field, which I'll explain more about shortly. The original vector field, then, the integrand, can be thought of as the density of the flux at any particular point.

So: The magnetic flux density describes the magnetic field that surrounds a permanent magnet or an electrical current. It's a vector field measured in webers/m² (also known as teslas) and we've seen how it's used in the Lorentz force equation, but how do we get the equation for B? Well, as I said, the Lorentz force law is somewhat analogous to Coulomb's law, but in the form above we're solving for F, the force exerted on a particle q. If we solve for B in free space (using the vacuum permeability we discussed above), however, we get a different formulation that should look more familiar:

$$\mathbf{B} = \frac{\mu_0\, q\mathbf{v}}{4\pi|\mathbf{r}|^2}\times\frac{\mathbf{r}}{|\mathbf{r}|}$$

In this case, r is the field point, just as in E, but unlike the electrical field the flux density also depends on the particle described by qv moving through the magnetic field. Remember, the magnetic flux density is always a product of the relative motion of two magnetic fields or electrical charges!

ρ and J: Electric charge density and electric current density

These are simple variables describing electrical charge in a material.

ρ, the electric charge density, describes the amount of static electric charge in a volume of material. The formula for electric charge density is ρ = q/V, where q is, as usual, the charge measured in coulombs, and V is the volume of material in cubic meters, giving ρ units of C/m³.

J is the electric current density, which means the amount of charge per cross-section of the material it's flowing through. Its formula is J = I/A, where I is the current in amperes and A is the cross-sectional area of the conductor the current is traveling through, and it has units of amperes per square meter, A/m².

Wait, square meters? Yes! Remember, it's a measure of current through a cross-section of the material. For instance, if your current is flowing through a round wire, the area you want to consider is the area of the circular cross section of that wire, so the current density formula would be:

$$J = \frac{I}{\pi r^2}$$

With r being the radius of the wire at a particular point; this can be used, if you have the average current of a conductor and its average cross-sectional area, to get the average current density across the conductor, and can also be used in differential form, which we'll talk about later, to get the instantaneous current.

Finally, and most importantly for our purposes, J as a vector can also be determined in terms of the electric charge density: J = ρv, where ρ is the charge density as above, and v is the velocity of the charge through the material. This is exactly equivalent in magnitude to the scalar formula, with the direction being the direction of the flow of positive charges at the point in the conductor represented by v.
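A quick numeric sketch of both formulas; all the numbers here are made up for illustration:

```python
import numpy as np

# Average current density in a round wire, J = I / (pi r^2)
I = 10.0      # amperes
r = 0.001     # meters (a 1 mm radius wire)
print(I / (np.pi * r**2))    # ~3.18e6 A/m^2

# Or as a vector from charge density and velocity, J = rho * v
rho = 1.4e10                        # C/m^3, also made up
v = np.array([2.3e-4, 0.0, 0.0])    # m/s
print(rho * v)                      # current density vector along x
```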

Vector field operations and vector calculus

Alright! A lot of what we've talked about up to this point was setting the stage for this; we need to know our basic vector math, but also the basic concepts of vector fields and some of what we're going to be using this for in order to make it easier to talk about.

First, let's look at the main differential vector calculus operator, the del or nabla operator, ∇. The del describes an operation of multiple partial derivatives; it takes a function and produces a vector whose components are the partial derivatives of each dimension of the function. As a vector operator, the way it operates on different types of functions varies; on a scalar function over n variables, it produces an n-dimensional vector. For a scalar function in three variables, f(x,y,z):

$$\nabla f = \frac{\partial f}{\partial x}\hat{\mathbf{i}} + \frac{\partial f}{\partial y}\hat{\mathbf{j}} + \frac{\partial f}{\partial z}\hat{\mathbf{k}}$$

Where {$\hat{\mathbf{i}}$, $\hat{\mathbf{j}}$, $\hat{\mathbf{k}}$} are our standard basis vectors as above. For example, given f(x,y,z) = x² + y² + z²:

$$f(x,y,z) = x^2 + y^2 + z^2$$

$$\nabla f = \frac{\partial f}{\partial x}\hat{\mathbf{i}} + \frac{\partial f}{\partial y}\hat{\mathbf{j}} + \frac{\partial f}{\partial z}\hat{\mathbf{k}} = 2x\,\hat{\mathbf{i}} + 2y\,\hat{\mathbf{j}} + 2z\,\hat{\mathbf{k}} = \langle 2x, 2y, 2z \rangle = \begin{pmatrix} 2x \\ 2y \\ 2z \end{pmatrix}$$

(This is the last time I'm going to show multiple representations of a vector like this but I wanted to make sure it was very clear what was happening)

This operation by itself is called the gradient, which can be notated as grad f or just ∇f. The gradient of a scalar field f always points in the direction of maximum increase of f, and its magnitude at any point is the maximum rate of increase at that point, just like a regular derivative. Indeed, if it is applied to a one-dimensional function, it should be obvious that it is identical to the normal derivative. To see more clearly, let's graph ∇(x² + y²) = ⟨2x, 2y⟩:

The graph on the left is a heatmap of the scalar field f(x,y) = x² + y², where the color at each point represents the value of f(x,y). On the right is a graph of the gradient, ∇f, showing the vectors (sampled, obviously) pointing in the direction of maximum increase at each point, with each vector colored by its magnitude. At each point, the gradient points "uphill", and the further we get from the origin and the "steeper" the increase gets, the longer the vector gets.
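A quick SymPy sketch of that gradient, if you want to check it (assuming sympy installed):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2

grad_f = [sp.diff(f, var) for var in (x, y)]
print(grad_f)    # [2*x, 2*y]

# At (1, 2) the gradient points "uphill", directly away from the origin
print([g.subs({x: 1, y: 2}) for g in grad_f])    # [2, 4]
```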

The nabla operator can also be used in a similar way on vector fields, but in this case we have to choose which type of product we're interested in, scalar or vector, resulting in two separate operations, the divergence operator, ∇⋅, and the curl operator, ∇×.

If we want a scalar product, we get the divergence operator, div F, notated by ∇⋅. The divergence of a vector field F is a measure of how much the vector field is converging to or diverging from any given point (x,y,z); you can think of it as something of a "reverse" of the gradient, where you put in a vector field and get back a scalar field. Another way to think about it is to think of the vector field as describing the flow of a liquid, so the divergence at a point describes how much fluid is flowing out from that point vs how much is flowing in. For example, let's take the divergence of a vector field F = ⟨x², y²⟩:

$$\mathbf{F} = \langle x^2, y^2 \rangle \qquad \nabla\cdot\mathbf{F} = \frac{\partial x^2}{\partial x} + \frac{\partial y^2}{\partial y} = 2(x+y)$$

As before, the graph on the left is what we started with, the vector field F = ⟨x², y²⟩, graphed in the same manner as our previous vector field graph. On the right we have a heatmap of the divergence, ∇⋅F. Note that it gets "stronger" as we go up on both axes, in the same direction as the overall tendency of the vector field.

Divergence is just a cheap tactic to make weak vector fields stronger.
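A quick SymPy sketch of the same divergence computation:

```python
import sympy as sp

x, y = sp.symbols('x y')
P, Q = x**2, y**2    # F = <x^2, y^2>

div_F = sp.diff(P, x) + sp.diff(Q, y)
print(sp.factor(div_F))    # 2*(x + y)
```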

Finally, we have the curl operator, curl F or ∇×. The curl of a vector field is probably the most conceptually difficult, and is also the most mathematically complex, of these field "quasi-derivatives".

The curl of a vector field describes the rotation of the field at any given point. Once again thinking of the field as a fluid, imagine putting a little paddlewheel in the fluid at some point, and the curl at that point would describe the rotation of the little paddlewheel. Another way to think about it is dropping a toy boat in a stream and seeing where it spins (where vortices form in the water) and where it just drifts straight.

To calculate ∇×F, where F is some three-dimensional vector field:

$$\mathbf{F}(x,y,z) = \langle P(x,y,z),\ Q(x,y,z),\ R(x,y,z) \rangle$$

$$\nabla\times\mathbf{F} = \left\langle \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z},\ \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x},\ \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right\rangle$$

Now, one thing to note is that the curl operator is written ∇× by analogy with the cross product, but it doesn't have all the same properties as the cross product; unlike the cross product, curl is well-defined on two-dimensional vector fields (by treating them as 3-d fields with a zero z component), which is nice for us being able to give it a simple visualization. Going back to the vector field F(x,y) = ⟨−y, x⟩ we talked about above, which describes a counterclockwise rotation about the origin, the curl of this field would be:

$$\mathbf{F}(x,y) = \langle -y, x \rangle$$

$$\nabla\times\mathbf{F} = \left\langle \frac{\partial}{\partial y}0 - \frac{\partial}{\partial z}x,\ \frac{\partial}{\partial z}(-y) - \frac{\partial}{\partial x}0,\ \frac{\partial}{\partial x}x - \frac{\partial}{\partial y}(-y) \right\rangle = \langle 0, 0, 2 \rangle$$

The curl, the rotation of the field, is positive along the z axis. What this tells us is that the field has rotation in the counterclockwise (positive) direction, which makes sense based on the graph of the vector field:

Graph of the described vector field, showing counterclockwise rotation that gets stronger the further the point is from the origin

If we reverse the direction of the field rotation, we reverse the curl:

$$\mathbf{F}(x,y) = \langle y, -x \rangle$$

$$\nabla\times\mathbf{F} = \left\langle \frac{\partial}{\partial y}0 - \frac{\partial}{\partial z}(-x),\ \frac{\partial}{\partial z}y - \frac{\partial}{\partial x}0,\ \frac{\partial}{\partial x}(-x) - \frac{\partial}{\partial y}y \right\rangle = \langle 0, 0, -2 \rangle$$

The rotation described by the resulting vector is always around the specified axis; for our 2-d fields, the rotation is always around the z-axis sticking straight up from the plane our field is in. A positive value describes counterclockwise rotation around that axis, while a negative value means it's clockwise.
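And one last SymPy sketch running both fields through the component formula; the little curl helper here is mine, just transcribing the formula above:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def curl(P, Q, R):
    """Component formula for the curl of F = <P, Q, R>."""
    return (sp.diff(R, y) - sp.diff(Q, z),
            sp.diff(P, z) - sp.diff(R, x),
            sp.diff(Q, x) - sp.diff(P, y))

print(curl(-y, x, 0))    # (0, 0, 2): counterclockwise
print(curl(y, -x, 0))    # (0, 0, -2): clockwise
```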


So! That's the mathematical foundations we need to discuss the equations themselves. This is a lot to digest, and was even more to write since I've been working on this since November fucking seventh. Tentatively, I plan to have followup posts for each pair of equations, and will be addressing some of the nitty-gritty of the more complex calculus concepts I glossed over as we deal with them. I have no idea when I'll be getting those so... hey, it can come as a surprise, like a cat dropping a mouse on your doorstep except the mouse is a bunch of inscrutable vector calculus.



in reply to @DecayWTF's post:

I have never felt the fact I only took multivariate calc for three months over a decade ago more intensely than I do right now.

Re: curl, if the positive z-axis points out of my screen towards me, how do you go from knowing rotation is positive in the z axis to knowing the rotation is counterclockwise?

Combination of factors here. The direction of the curl is around the axis in question, so for a constant curl of ⟨0,0,2⟩ as we see above, it's around the z axis, on the plane formed by the x and y axes; if you think of it as like a pond, the x and y axes would be on the surface of the pond and the z axis could be marked by a pole sticking up out of the water. Now, because it's rotation around the axis, the direction depends on the coordinate system. Conventional vector math uses a right-handed coordinate system; x increases to the right, y increases up and z increases "toward" you, up off the plane formed by x and y. You can invert any of those axes and get a left-handed coordinate system, in which case positive rotation is clockwise.

Beyond that, if you want to know why the convention is right-handed coordinates: That's how the universe works! Magnetism, for instance, displays right-handed chirality. As far as I know - and if anyone knows different, I would love to know about it! - there's no underlying mathematical reason and left-handed coordinate systems can be used and are used when appropriate.